EFFICIENT DIRECTIONAL GENETIC CLONING SYSTEM



BACKGROUND OF THE INVENTION 
FIELD OF THE INVENTION

The present invention relates to vectors 
for molecular cloning of DNA segments, particularly 
to cloning vectors employing non-symmetrical 
restriction enzyme recognition sites for insertion 
of DNA segments in a defined direction relative to
the vector. This invention also relates to use of such
vectors in methods for efficient cloning of genomic 
DNA segments and of complementary DNA (cDNA) copies 
of messenger RNA (mRNA) molecules from eukaryotic 
genes, and to the manufacture and use of novel 
products related to these vectors and methods. 

BACKGROUND OF THE INVENTION

The development of DNA cloning techniques 
for complementary DNA (cDNA) copies of messenger RNA 
(mRNA) molecules has been of great value in the 
study of eukaryotic genes. In many cases, the 
amount of a given mRNA for which cDNA clones are 
desired is limited by the availability of 
appropriate tissue sources and/or a low
concentration of that specific mRNA in those 
sources. Therefore, readily obtainable sources may 
provide only a few copies of a given mRNA molecule 
from which cDNA clones might be produced. 

The requirements for any efficient method 
for cDNA cloning may be generally summarized as 
follows: first, full-length double-stranded cDNAs
must be produced from the mRNA with high yield; the
ends of the resulting DNA fragments must be made
capable of being joined efficiently to the vector
DNA by enzymatic ligation; production of undesirable
ligation byproducts must be minimized; and,
preferably, insertion of the cDNA into the vector
DNA should provide expression of the cDNA to
facilitate detection of the desired clone by means
of the product.

Production of the protein product may be
necessary for detecting a gene when no nucleic acid
probes for the desired gene are available. More
generally, such expression of the protein is
desirable because, in terms of copy number, the
protein provides a molecular signal that is greatly
amplified in relation to the DNA molecules of the
cloned gene inside the host cell.

As it is difficult to achieve high
efficiency of conversion of mRNA molecules into
full-length cDNA clones, especially when the mRNA of
interest is relatively long, several refinements in
cDNA cloning strategy have been made. Among them,
the Okayama-Berg method significantly improved the
efficiency of full-length cDNA cloning.

The Okayama-Berg approach has several
advantages over previous, conventional methods for
cloning cDNAs. The following section is intended to
highlight these advantages in relation to the main
steps of this complicated method. For a more
complete and detailed description of the method, see
the original publication [Okayama, H. and Berg, P.
(1982) Mol. Cell. Biol. 2, 161-170], which is hereby
incorporated herein by reference.

The main advantages of the Okayama-Berg
method for cDNA cloning relate to the fact that, as
part of the processing needed to form mRNAs,
transcripts of eukaryotic genes undergo enzymatic
addition of multiple adenosine residues at the 3'
end, thereby acquiring what is known as a "poly(A)
tail". In the present context, the term mRNA
encompasses any RNA species from any source, natural
or synthetic, having a 3' poly(A) tail comprising
two or more adenosine residues.

In the original Okayama-Berg approach, 
synthesis of the first DNA strand from the mRNA 
template is initiated by annealing the 3' poly(A) of
the eukaryotic mRNA to an oligo(dT) primer which 
forms an extension of one end of a DNA strand of the 
cloning plasmid. First strand cDNA synthesis by 
this "plasmid-priming" method directs the 
orientation of the sequence within the cDNA into a 
unique relationship with the sequence in the 
plasmid; hence, this approach has been called 
"directional" cloning. Directional cloning ensures 
that every cDNA clone that is formed will be 
correctly oriented for a promoter provided in the 
cloning plasmid (an SV40 promoter in the original 
Okayama-Berg system) to drive transcription of the 
proper cDNA strand to produce RNA with the correct 
sense for translation into the protein encoded by 
the original mRNA template. 

To provide high efficiency of ligation in 
cloning DNA segments in general, restriction 
nucleases are utilized to produce short single- 
stranded ends on the DNA that are complementary in 
base sequence to any other DNA end produced by the 
same enzyme. Accordingly, these single-stranded 
ends can anneal together by forming specific DNA 
base pairs, or, in the vernacular, they are 
"sticky". This annealing greatly enhances the rate 
of joining DNA segments by enzymatic ligation and 
further provides a means for selectively joining 
ends of segments treated with the same enzyme.

In the original Okayama-Berg method, after
synthesis of the first cDNA strand, an oligo(dG)
tail is attached enzymatically to the free end of
the plasmid-primed cDNA, and then the plasmid is
cleaved by a restriction enzyme (HindIII) to produce
a sticky end on the plasmid opposite to the end
where the cDNA is attached. A short DNA fragment
("linker"), which contains the SV40 promoter and has
a cleaved HindIII site on one end and oligo(dC) on
the other, is then attached to the cDNA-plasmid
molecule by ligation, to circularize the molecule.

In other, more conventional methods a
(synthetic) linker may also be used to clone cDNAs,
but it is attached after second strand DNA synthesis
and further enzymatic repair which is necessary to
form perfectly matched strands (i.e., a "flush" or
"blunt" end). To protect internal restriction sites
of the double-stranded cDNA from cleavage by the
restriction enzyme required to allow ligation of the
vector and linker, prior to addition of the linker,
the cDNA is methylated with the appropriate DNA
modification system associated with the given
restriction enzyme. However, such protection may
not be absolute; thus, internal sites may be cleaved
at some frequency due to an incomplete methylation
reaction. In contrast, in the original Okayama-Berg
method, this problem of internal cleavage of
cDNAs is obviated by cleavage of HindIII sites on
the vector when the cDNA is represented as an
RNA:DNA hybrid that resists restriction.

The Okayama-Berg approach provides yet
another advantage over previous methods in which
both ends of separately synthesized cDNAs are
ligated to the vector ends at the same time, namely
that according to Okayama-Berg, the necessary
circularization of the vector DNA with the cDNA
attached at one end is relatively efficient via the
linker because only one juncture between the cDNA
and vector molecules remains to undergo ligation.

Furthermore, the overall Okayama-Berg
approach offers additional advantages over previous
methods. Following circularization, a process
called "RNA nick translation" using DNA polymerase I
and RNase H is used, which facilitates complete
synthesis of the second strand along the entire
first strand. This process overcomes the inherently
low processivity of DNA polymerase I by using
multiple sites for priming of second strand DNA
synthesis with DNA primer fragments having random
sequences.

Finally, since the Okayama-Berg vector has
already been joined to the cDNA when the second
strand is synthesized, truncation of cDNA molecules
close to the 3' end of the cDNA generally does not
occur, in contrast to other methods in which the
second strand is completed while the 3' end of the
first strand is free and, therefore, more
susceptible to damage from nuclease activities.

Cloning vectors based on bacteriophage λ
are also known. The second strand synthesis
reaction of the Okayama-Berg method has also been
utilized in a simpler cloning procedure [Gubler, U.
and Hoffman, B. J. (1983) Gene 25, 253-269],
allowing cDNA cloning in such λ vectors [Huynh,
T. V., Young, R. A. and Davis, R. W. (1985) in DNA
Cloning, A Practical Approach, ed. Glover, D. (IRL,
Oxford), Vol. 1, pp. 49-78]. This λ-based cDNA
cloning method has been widely used, mainly due to
the high efficiency of transmission of recombinant
DNA into cells by means of infectious phage
particles, which are produced with in vitro DNA
"packaging" systems. λ phage cloning systems also
offer convenient clone screening capabilities due to
tolerance of a high density of λ plaques on test
plates to be screened, compared with most plasmid
systems which permit only lower densities of
bacterial host colonies.

Early λ systems for cDNA cloning, however,
while retaining the second strand synthesis strategy
of the original Okayama-Berg plasmid method, lack
some of its other advantages. For example,
directional cloning is not possible in those
original λ systems. In addition, multiple inserts
and truncated cDNAs are frequently obtained.
Further, despite the high packaging efficiency for
native λ DNA molecules, the packaging efficiency of
recombinant DNA molecules that are produced by
cleavage of intact linear λ molecules and ligation
with cDNA fragments is usually low compared to that
of intact λ DNA.

Recently, directional cloning capabilities
have been introduced into various λ vectors. For
example, one such directional λ vector employs a
site for insertion of DNA segments that comprises
two different restriction enzyme cleavage sites
[Meissner, P. S., et al. (1987) Proc. Nat. Acad.
Sci. USA, 84, 4171-4175]. The cDNA molecules are
primed with oligo(dT), made double-stranded, and
then methylated with the enzymes needed for
protection against internal cleavage by both of the
nucleases used in the DNA insertion site of the
vector. A linker segment containing a cleavage site
for only one of the nucleases of the insertion site
is added to both ends of the cDNA. The combination
of the last two A:T base pairs on the 3' end of the
cDNA with the sequences at one end of the linker,
however, creates a cut site for the other of the two
nucleases of the insertion site. Thus, after
restriction with both nucleases of the insertion
site, the individual cDNA segments can ligate into
the vector only in a single direction with respect
to the two different cleavage sites in the vector.

Various general disadvantages of this
particular approach for cDNA cloning in λ phage,
compared to the Okayama-Berg plasmid method, have
been described above in relation to other systems;
and other problems specific to this approach have
been noted [Meissner, P. S., et al. (1987), supra].
Nevertheless, it was reported that one cDNA library
constructed by this method, starting from 5 µg of
mRNA, contained about 2 x 10^6 clones with 8 of 10
having cDNA inserts (i.e., the reported cloning
efficiency was about 3 x 10^5 recombinants per µg of
poly(A)+ RNA).

Directional cloning in other λ phage
vectors has also been reported [Palazzolo, M. J. and
Meyerowitz, E. M. (1987) Gene 52, 197-206]. [These
vectors are known as λSWAJ or λGEM, certain variants
of which (LambdaGEM™-2 and LambdaGEM™-4) are
commercially available from Promega Corporation of
Madison, Wisconsin. The λGEM type of vectors are
also examples of a composite vector comprising both
a λ phage genome and an embedded plasmid (GEM).]
The directional cloning scheme in these λ vectors
utilizes two different restriction enzyme cleavage
sites at the site for insertion of DNA. Thus, for
example, to attach the end of a cDNA corresponding
to the poly(A) end of the mRNA to a particular end
of the cleaved vector DNA that has a sticky end for
the restriction enzyme SacI, a synthetic DNA
"linker-primer" segment is used which combines a
single-stranded oligo(dT) primer with a restriction
site for the enzyme SacI. After second strand
synthesis, a linker segment with the site of a
second restriction enzyme is ligated to the other
end of the cDNA, which is then restricted with both
enzymes of the insertion site of the vector,
according to much the same strategy as described for
the previous example of a directional λ phage
vector.

This particular approach for directional
cloning in a λ vector, however, cannot be used to
obtain full-length cDNAs of certain mRNAs because it
requires cleavage of the cDNA molecules by the
restriction enzyme SacI and a second enzyme (e.g.,
XbaI) without first protecting the internal sites
for these enzymes by appropriate methylation. [In
an alternative version of the scheme reported by
Palazzolo and Meyerowitz, supra, the XbaI enzyme was
replaced by EcoRI and the cDNA was methylated to
protect against only this one enzyme.] Sites for
these particular enzymes occur frequently by chance
in natural nucleotide sequences. Thus, restriction
of cDNAs with enzymes like these, as taught in this
approach, causes truncation of cDNA inserts with
internal SacI (and/or XbaI) sites. In relation to
cloning efficiency, it may be noted that this
publication described a single cDNA library
constructed by this method, starting from 1 µg of
poly(A)+ RNA, that contained about 1.6 x 10^6 clones
with cDNA inserts.

In addition to the publications on
directional cloning systems described above, there
is a report which describes a non-directional
plasmid-based system that uses an efficient
oligonucleotide-based strategy to promote cDNA
insertion into the vector [Aruffo, A. and Seed, B.
(1987) Proc. Nat. Acad. Sci. USA 84, 8573-8577].
This method uses synthetic DNA adaptors that encode
a recognition site for a particular restriction
enzyme, BstXI, which has a variable recognition
sequence, as illustrated below:

5'-CCANNNNN↓NTGG-3'
3'-GGTN↑NNNNNACC-5'

where A, T, G and C indicate nucleotides having the
DNA bases adenine, thymine, guanine, and cytosine,
respectively (for which the pairs A:T and G:C are
complementary), and N represents bases that are
included within the recognition site sequence but
that can be any of the usual DNA bases, provided
only, of course, that each N and the corresponding N
on the opposite DNA strand be complementary. The
arrows (↓ and ↑) indicate the cleavage sites on the
upper and lower DNA strands, respectively.
Accordingly, cleavage of the BstXI site creates a
4-base single-stranded extension (sticky end) on the
3' end that varies from site to site.
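
The variable recognition sequence described above lends itself to a
short illustrative sketch. The following Python fragment, which is
not part of the cited report, scans one strand for the pattern
CCANNNNNNTGG and reports the four bases that would form the
single-stranded extension. The function name and the test sequence
are hypothetical, and the cut positions (after the eighth base of
the site on the upper strand and after the fourth on the lower) are
an assumption taken from commonly published descriptions of BstXI.

```python
import re

# BstXI recognition pattern: CCA, six bases of any identity, then TGG.
# The lookahead allows overlapping sites to be reported as well.
BSTXI_SITE = re.compile(r"(?=(CCA[ACGT]{6}TGG))")

def bstxi_sticky_ends(strand):
    """Return (position, extension) for each BstXI site on a 5'->3' strand.

    Assumption: the single-stranded extension consists of bases 5-8
    of the 12-base site, read 5'->3' on the scanned strand.
    """
    results = []
    for match in BSTXI_SITE.finditer(strand.upper()):
        site = match.group(1)
        results.append((match.start(), site[4:8]))
    return results

# Hypothetical test strand containing one site, CCATTGTGATGG.
print(bstxi_sticky_ends("AACCATTGTGATGGTT"))
```

Because the extension lies entirely within the six variable
positions, two BstXI sites can yield different sticky ends even
though one enzyme cuts both.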

The report above discloses a plasmid
vector with a site for insertion of DNA segments in
which two identical BstXI sites were placed in
inverted orientation with respect to each other and
were separated by a short replaceable segment of
DNA. Inversion of a DNA sequence consists of
representing the base sequence of each strand,
conventionally expressed in the 5' to 3' direction
of the polynucleotide backbone, in a DNA strand with
the same base sequence presented in the 3' to 5'
direction (e.g., inversion of the DNA sequence
5'-ACTG-3' produces the DNA sequence 3'-ACTG-5' or, in
the conventional 5' to 3' format, 5'-GTCA-3').
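
The definition of inversion given above can be stated in two lines
of code. This is only an illustrative sketch with hypothetical
function names; it also computes the reverse complement, which is
the sequence of the opposite strand of the same duplex, to show that
the two operations are not the same thing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def invert(seq):
    """Present the same bases in the opposite direction, written 5'->3'."""
    return seq[::-1]

def reverse_complement(seq):
    """Sequence of the opposite strand of the duplex, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

print(invert("ACTG"))              # the example used in the text
print(reverse_complement("ACTG"))  # the partner strand, for comparison
```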

With the particular BstXI recognition
sequence that was employed in this vector, the
4-base single-stranded ends of the inverted sites
created on the two ends of the vector DNA by
restriction with the BstXI enzyme were not able to
anneal with one another. This situation is
illustrated below, where two identical sites, one
inverted relative to the other and separated by an
unspecified sequence (N...N), are shown; the sticky
ends of the vector produced by cleavage with the
BstXI enzyme are shown in lower case:

5'-(vector)CCANtgtgNTGG(N...N)CCANCACANTGG(vector)-3'
3'-(vector)GGTNACACNACC(N...N)GGTNgtgtNACC(vector)-5'

(Note that the reference does not specify the
entire BstXI recognition sequence that was used;
only the sequence of the sticky end is clearly
defined, as indicated above by inclusion of the N
symbol where necessary.)

Inspection of these single-stranded end
sequences on this plasmid vector reveals that
they are identical, due to the inversion of one
of the sites relative to the other. Thus, the
ends of the vector with inverted and non-inverted
copies of this particular BstXI restriction site
sequence cannot anneal with each other.
Similarly, the restricted ends of the spacer DNA
segment between these two sites will be
identical. Accordingly, to clone cDNA segments
in this vector, a synthetic adaptor was attached
to each end at the double-stranded stage, by
blunt end ligation, giving them the same termini
as the replaceable segment that was removed from
the vector with BstXI. The specific adaptor used
in the above report comprises the following
oligonucleotide sequences:

5'-CTTTAGAGcaca-3'
3'-GAAATCTC-5'

Obviously, addition of this single adaptor to
both ends of the cDNA segments would provide
those segments with ends (in lower case) that
could anneal and subsequently ligate efficiently
to both identical vector ends.
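
The pairing argument above can be checked mechanically. The sketch
below is an illustration, not anything disclosed in the cited
report: it treats each sticky end as a 5' to 3' string and lets two
ends anneal only when one is the reverse complement of the other.
The helper names are hypothetical; the extension TGTG on both vector
ends and the two adaptor strands are read from the sequences shown
above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def can_anneal(end_a, end_b):
    """True if two single-stranded ends (5'->3') are fully complementary."""
    return end_a == reverse_complement(end_b)

VECTOR_END = "TGTG"            # same extension on both vector ends
ADAPTOR_LONG = "CTTTAGAGCACA"  # written 5'->3'
ADAPTOR_SHORT = "CTCTAAAG"     # 5'->3' form of 3'-GAAATCTC-5'

# The short strand pairs with the first eight bases of the long strand,
# so the unpaired remainder of the long strand is the adaptor's sticky end.
assert reverse_complement(ADAPTOR_SHORT) == ADAPTOR_LONG[:len(ADAPTOR_SHORT)]
adaptor_end = ADAPTOR_LONG[len(ADAPTOR_SHORT):]

print("adaptor end:", adaptor_end)
print("vector to vector:", can_anneal(VECTOR_END, VECTOR_END))
print("cDNA to cDNA:", can_anneal(adaptor_end, adaptor_end))
print("vector to cDNA:", can_anneal(VECTOR_END, adaptor_end))
```

Only the vector-to-cDNA pairing succeeds under this simple model,
which is the behavior attributed to the cited system.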

WO 91/02077                    PCT/US90/04239

Thus, Aruffo and Seed, 1987, supra,
disclose a method using this particular BstXI
recognition site sequence, whereby neither the
cDNA (with attached adaptors) nor the isolated
vector DNA (after being freed from the
replaceable segment after cleavage with BstXI)
was able to ligate to itself. This work,
however, neither teaches nor suggests general
requirements for a BstXI recognition sequence, or
for those of other restriction enzymes, to be
usable in this cloning approach.

Further, as these workers pointed out,
their strategy did not provide a directional
cloning capability. After first alleging that
such directional capability was not needed, they
admitted that, nonetheless, they had devoted
considerable unsuccessful efforts to developing
an alternative means of producing mRNA from every
cDNA clone, namely a bidirectional transcription
capability whereby both strands of an inserted
cDNA would be transcribed. They concluded that
this goal cannot be easily attained, at least not
in their cloning host system. The authors
stated, moreover, that they could obtain cloning
efficiencies with their plasmid that were between
0.5 and 2 x 10^6 recombinants per µg of mRNA, which
were said to compare favorably with those
described for certain cloning systems based on
phage λ. In the only example of a cDNA library
described in this reference, however, the yield
of cDNA clones obtained by this method was
actually stated to be only about 3 x 10^5
recombinants from 0.8 µg poly(A)-containing RNA
(i.e., less than 0.4 x 10^6 recombinants per µg
poly(A)-containing RNA).
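
The comparison drawn above is a simple ratio, and a brief sketch
makes the arithmetic explicit. The function name is hypothetical,
and the inputs take the figures quoted above as 3 x 10^5
recombinants obtained from 0.8 microgram of RNA.

```python
def recombinants_per_microgram(recombinants, rna_micrograms):
    """Cloning efficiency expressed per microgram of input RNA."""
    return recombinants / rna_micrograms

library_yield = recombinants_per_microgram(3e5, 0.8)
claimed_lower_bound = 0.5e6  # low end of the range stated by the authors

print(f"{library_yield:.2e} recombinants per microgram")
print("below the stated range:", library_yield < claimed_lower_bound)
```

On these figures the single documented library falls short of even
the lower end of the stated range of efficiencies.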

Thus, there has been a continuing need
for methods and vectors which would provide a
higher yield of cDNA clones from limited amounts
of eukaryotic mRNAs while also providing an
improved means of directing orientation of
inserted cDNA fragments within vector DNAs.



SUMMARY OF THE INVENTION 

The present invention contemplates the
application of methods of recombinant DNA
technology to fulfill the above needs for
increased efficiencies in DNA cloning systems
and, in particular, to develop new means for
directional insertion of cDNA fragments into
cloning vectors.

More specifically, it is an object of
the present invention to provide means for
directing assembly of insert DNAs into vector
DNAs to form a unique, predetermined recombinant
structure having the desired number and
orientation of each needed DNA fragment, so that
the number of resulting clones containing single
inserts, as well as the probability of obtaining
a full-length clone from each mRNA molecule, are
enhanced.

Further, it is an object of this
invention to provide a cDNA cloning system which
combines the features of this highly efficient
cloning strategy with advantageous features of λ
phage vectors to overcome limitations of the
presently available λ cloning systems.

Accordingly, the present invention
relates to highly efficient means for inserting
DNA segments into cloning vectors in a defined
orientation, and a method for using such means
that is referred to herein as the "automatic
directional cloning (ADC)" method. Novel DNA
vectors and DNA segments are also included.

Appreciation of the operation and
advantages of this invention requires further
analysis of the problems in the prior approaches.
The understanding of these problems by the
present inventors led to development of this
invention.

The present invention has been developed
in light of recognition by these inventors of
major sources of the limitations on cloning
efficiency with the present systems designed for
directional cloning, namely that they all employ
restriction enzymes with recognition site
sequences having one or both of the following
disadvantages: they are either too short or they
have a particular type of symmetry called "dyad"
symmetry.

As noted above, present λ phage vectors
for directional cloning of cDNAs suffer
inefficiencies due in part to their use of
restriction enzymes with recognition sequences
that occur frequently in natural DNA sequences.
Some problems relating to this issue might be
solved by choosing a restriction enzyme with an
infrequently occurring site (i.e., a longer
recognition sequence which, by chance, would
occur less frequently in random natural DNA
sequences).

However, even when modified to utilize
an infrequently cutting restriction enzyme, the
present implementations of directional cloning in
λ have a drawback that is common to any cloning
scheme using restriction enzymes with recognition
sequences having dyad symmetry of the sticky ends
produced by cleavage with the enzyme.

Typical restriction enzymes with
recognition site sequences having dyad symmetry
make staggered cuts in the two opposing DNA
strands at symmetrical points surrounding the
center of a dyad pattern. Cleavage by this type
of enzyme produces short single-stranded ends
which are complementary in base sequence to those
of any other DNA fragment produced by cleavage
with the same enzyme.

For example, the recognition site of the
commonly used restriction enzyme, EcoRI, consists
of the following complementary sequences which,
when cleaved by the enzyme, produce the 4-base
extension of the 5' end of the DNA containing the
dyad "TTAA" (shown in lower case):

5'-G↓AATTC-3'
3'-Cttaa↑G-5'

where A, T, G and C indicate DNA bases, as
described above for the BstXI site, and the arrows
mark the cleavage positions. Inspection
of this EcoRI sticky end sequence readily reveals
that inversion of this sequence produces its
complement, namely "AATT". Thus, any DNA end
produced by EcoRI can anneal to any other such
DNA end; and, therefore, any EcoRI sticky end can
also be ligated efficiently to any other such
end. Similarly, all DNA ends which are produced
by any one restriction enzyme that generates
sticky ends characterized by dyad symmetry are, in
the case of each sticky end sequence, readily
ligatable one to another. [Hereinafter, such
"self-ligatable" single-stranded ends of DNA that
are producible by a restriction enzyme will be
simply referred to as "symmetrical ends", and the
enzymes that produce them, as "symmetrical
restriction enzymes".]
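
The test for a symmetrical end described above (inversion of the
sequence yields its complement) can be written directly. The
following sketch is illustrative only, and the function name is
hypothetical; TTAA and AATT are the EcoRI sequences discussed above,
and TGTG is the BstXI-derived end from the Background.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def has_dyad_symmetry(end):
    """True if the end read backwards equals its base-by-base complement."""
    return end[::-1] == end.translate(COMPLEMENT)

for end in ("TTAA", "AATT", "TGTG", "ACG"):
    label = "symmetrical" if has_dyad_symmetry(end) else "non-symmetrical"
    print(end, label)
```

An end with an odd number of bases, such as the last example, can
never pass this test, because its middle base would have to be its
own complement.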

In light of this symmetrical nature of
many restriction enzyme recognition sites, one of
the major problems with existing directional λ
vectors can be more fully appreciated. When the
two end fragments of cleaved λ DNA (i.e., the
so-called λ "arms") are ligated with cDNA fragments,
several products are produced, only some of which
constitute the desired infectious DNA molecules
containing cDNA inserts. For instance, consider
the simplest case, when the ends on both the cDNA
and on each of the "left" and "right" λ arms (as
the two λ DNA arms have been designated in a
genetic mapping convention) have been cut by the
same symmetrical restriction enzyme. Here,
linear structures other than those with the
desired order (i.e., "left arm-cDNA insert-right
arm") may form in significant amounts during
ligation with cDNA fragments; and the cDNAs
trapped in these other, nonviable structures
cannot produce phage clones. These undesirable
ligation byproducts may include self-ligation
products of the two ends of individual vector or
cDNA segments, consisting of circular DNAs.
Ligation products in this instance may also
comprise vector-cDNA combinations containing
multiple inserts, which, even if viable, may
create problems in expression or identification
of original mRNA structure.

On the other hand, when each end of the
vector and insert cDNA molecule are ultimately
produced by two different symmetrical restriction
enzymes, as in the present directional λ systems,
these ends are then physically distinguishable in
relation to the polarity of the encoded genetic
information in each DNA segment, i.e., the
matching of complementary sticky ends on vector
DNAs and cDNAs results in the desired directional
cloning of the cDNA insert relative to functional
sequences in the vector (e.g., a promoter).
Further, circularization due to self-ligation of
cDNAs or vectors without inserts is eliminated by
the use of two different symmetrical restriction
enzymes.

Other undesirable ligation byproducts
remain, however, in the usual two-enzyme approach
for directional cloning using symmetrical
enzymes. Some of these are dimers of vector or
cDNAs, which may be designated, for example, as
"tail-to-tail" or "head-to-head" dimers. Thus,
even when vector and cDNAs are made by cutting
with two different symmetrical enzymes, head-to-head
and tail-to-tail dimers are not eliminated,
although the population of desired molecules is
significantly higher.
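
The persistence of head-to-head and tail-to-tail dimers can be
illustrated with a small enumeration. This is a simplified sketch
under stated assumptions: ends are modeled only by their 5' to 3'
sequence (the distinction between 5' and 3' extensions is ignored),
the symmetrical example uses the EcoRI end AATT and the XbaI end
CTAG, and the non-symmetrical ends TGTG and ACCA are hypothetical
values chosen for illustration. All names are hypothetical.

```python
from itertools import combinations_with_replacement

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def can_anneal(end_a, end_b):
    """True if two single-stranded ends (5'->3') are fully complementary."""
    return end_a == end_b.translate(COMPLEMENT)[::-1]

def self_junctions(fragment_ends):
    """List the end pairs by which two copies of one fragment could join."""
    names = sorted(fragment_ends)
    return [
        (a, b)
        for a, b in combinations_with_replacement(names, 2)
        if can_anneal(fragment_ends[a], fragment_ends[b])
    ]

two_symmetrical = {"head": "AATT", "tail": "CTAG"}
two_non_symmetrical = {"head": "TGTG", "tail": "ACCA"}

print(self_junctions(two_symmetrical))
print(self_junctions(two_non_symmetrical))
```

In this model the fragment cut with two symmetrical enzymes can
still pair head-to-head and tail-to-tail with a second copy of
itself, whereas the fragment with two non-symmetrical ends has no
way of joining to another copy.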

In contrast to existing systems based on
λ phage, the automatic directional cloning method
does not permit cDNA or vector fragments to
ligate to each other, ensuring the presence of a
single insert in each clone, as well as higher
cloning efficiencies and lower backgrounds of
clones that do not contain cDNA inserts.

To accomplish these goals, the present
invention contemplates use of restriction enzymes
which produce single-stranded ends that do not
exhibit dyad symmetry (hereinafter referred to as
"non-symmetrical ends" and correspondingly,
non-symmetrical recognition site sequences and
enzymes). Although certain preferred embodiments
of the present invention employ derivatives of
bacteriophage λ as the vector, which further
comprise embedded plasmid genomes, this invention
can be practiced with any self-replicating DNA
molecule (i.e., a "replicon") serving as the
vector for DNA cloning in any host in which the
selected replicon can be replicated.

Work cited in the Background above
describes a plasmid-based system that
advantageously employs two identical BstXI
recognition site sequences, albeit in two
different orientations. This single recognition
sequence is non-symmetrical according to the
definition in the present disclosure, although
the reference does not describe the BstXI
sequence in such terms or otherwise characterize
this sequence as such. The present invention is
clearly distinguishable from this previous
approach, as described below.

Use of BstXI sites is not readily
applicable to the λ system, due to the existence
of multiple BstXI recognition sites in the λ
phage genome, owing to the number of base pairs
in the variable recognition sequence that are not
allowed to vary (i.e., the "invariable base
pairs" being only six).
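
A rough estimate shows why six invariable base pairs are too few
for a genome the size of λ. The sketch below is illustrative and
rests on several assumptions that are not taken from this text: the
genome is treated as random sequence with equal base usage, the
wild-type λ genome length is taken as 48,502 base pairs, and SfiI is
taken to have eight invariable positions (GGCCNNNNNGGCC, as commonly
published). The function name is hypothetical.

```python
LAMBDA_GENOME_BP = 48_502  # assumed length of the wild-type lambda genome

def expected_sites(genome_bp, invariable_positions):
    """Expected number of sites in random DNA with equal base usage."""
    return genome_bp / 4 ** invariable_positions

for enzyme, fixed in (("BstXI", 6), ("SfiI", 8)):
    estimate = expected_sites(LAMBDA_GENOME_BP, fixed)
    print(f"{enzyme}: about {estimate:.2f} sites expected by chance")
```

Under these assumptions a site with six invariable positions is
expected roughly a dozen times in λ, while one with eight is
expected less than once, which is consistent with the requirement
stated below for more than six invariable base pairs.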

Accordingly, in one aspect the present 
invention relates to a genetic cloning vector 
comprising at least one replicon, and a site for 
inserting DNA segments to be cloned that includes 
at least two non-symmetrical restriction enzyme 
recognition sequences that are identical, where 
the first of these identical recognition 
sequences is in the inverted orientation with 
respect to a second identical sequence; and, in 
addition, the identical restriction enzyme 
recognition sequences include greater than six 
positions having invariable DNA base pairs. 
Recognition site sequences of the enzyme SfiI,
for example, fulfill both the length and 
asymmetry requirements for this aspect of the 
invention, as will become evident below. 

On the other hand, in plasmid systems or
other replicons lacking BstXI sites, either
naturally or due to genetic engineering, two
BstXI site sequences that are in the same
instance non-symmetrical and nonidentical can be
advantageously employed for efficient cloning of
DNA segments according to the present invention.

More generally, this aspect of this
invention may be practiced with any two
non-symmetrical restriction enzyme recognition
sequences that are not identical (recognition
site sequences of the enzyme SfiI, for instance).

Accordingly, the present invention also
relates to a genetic cloning vector comprising at
least one replicon, and a site for inserting DNA
segments to be cloned that includes at least two
non-symmetrical restriction enzyme recognition
sequences that are nonidentical. In this aspect
of the invention, two of the non-symmetrical
restriction enzyme recognition sequences can be
selected advantageously to be cleavable by a
single restriction enzyme, for example, BstXI or,
alternatively, SfiI; or each of two nonidentical
restriction enzyme recognition site sequences may
be selected to be cleavable by a different
enzyme. Preferably, at least one of the
non-symmetrical recognition sequences includes
greater than six positions having invariable DNA
base pairs; and most preferably, two nonidentical
recognition sequences include greater than six
positions having invariable DNA base pairs, as
typified by SfiI recognition site sequences.

The present invention further relates to
a vector, as described above, in which the
replicon comprises a form of bacteriophage λ.

The vector may advantageously further
comprise regulatory elements located in relation
to the site for insertion of DNA segments such
that, when a DNA segment is inserted into this
site, at least a portion of the sequences of the
DNA segment is transcribed. This portion may be
derived from either one of the strands of the
inserted double-stranded DNA segment, or from
both of these strands.

In one major embodiment of this aspect
of this invention, these regulatory elements in
the vector consist of promoters that entirely
originate from bacteriophage. By the phrase
"originate from" it is meant that the regulatory
element (e.g., promoter) is encoded in the genome
of the instant organism or virus (e.g.,
bacteriophage) as it occurs in nature. It should
be noted here that it is well known that,
generally, promoters that originate from
bacteriophage are not able to initiate
transcription in eukaryotic hosts. This
particular embodiment of the present invention is
exemplified by two λ-plasmid composite vectors,
LambdaGEM™-11 and LambdaGEM™-12, which are
commercially available from Promega Corporation
of Madison, Wisconsin.

According to available information at 
the time of the present disclosure, these 
particular LambdaGEM™ vectors apparently were
first disclosed in the 1988/1989 Catalogue and 
Applications Guide for Biological Research 
Products published and distributed by Promega 
Corporation in August of 1988, the entirety of 
which is hereby incorporated herein by reference. 
The following excerpts of that catalog describe 
these particular vectors and some of their 
various uses, particularly those relating to 
transcription from bacteriophage promoters. 

Section 11, page 5:

The LambdaGEM-11 vector is a multi-
functional genomic cloning vector designed for
high resolution mapping of recombinant inserts,
simplified genomic library construction, ultra-
low background of non-recombinants, and rapid
genomic walking. This lambda replacement-type
cloning vehicle contains the following features
(Figure 2 [not shown]): dual opposed
bacteriophage T7 and SP6 RNA polymerase
promoters, flanking asymmetric SfiI restriction
sites, and a multiple cloning site with
strategically positioned XhoI and BamHI
restriction sites. The LambdaGEM-11 vector also
contains unique sites for SacI, AvrII, EcoRI, and
XbaI. Because it is a derivative of EMBL3 (1),
DNA fragments ranging from 9-23 kb can be cloned
in the LambdaGEM-11 vector and the Spi phenotypic
selection against non-recombinants is available.
The vector was designed to make use of the SfiI
recognition sites flanking the promoters for the
high resolution restriction mapping of insert DNA
using the SfiI linker mapping system (Sec. 11,
pg. 8).

The T7/SP6 phage promoters simplify
chromosomal "walking", as RNA probes synthesized
from the extremities of the cloned insert can be
used to search a library for overlapping
sequences in either direction. In addition, the
nucleotide sequence of the end of an insert
cloned in the LambdaGEM-11 vector can be obtained
directly from the phage template by hybridizing
an SP6 or T7 oligonucleotide primer, followed by
a chain termination sequencing reaction (2,3).

Two cloning strategies for genomic
library construction, using DNA partially
digested with MboI or Sau3AI, are available with
the LambdaGEM-11 dephosphorylated BamHI arms. A
new cloning strategy (4) relies on the exclusive
specificity with which partially filled-in XhoI
LambdaGEM-11 arms (XhoI half-site arms) can be
combined with partially filled-in Sau3AI digested
genomic DNA. The only ligation products possible
are single copies of genomic inserts with
appropriate arms, since the partial fill-in
prevents self-ligation reactions of vector arms,
central stuffer, and genomic fragments. This
method also makes genomic DNA fractionation
unnecessary, is very rapid (Figure 3 [not
shown]), and requires small amounts of starting
material. The XhoI and BamHI sites in the
LambdaGEM-11 vector are strategically positioned
6 and 11 bases, respectively, from the
transcription initiation site of either promoter.
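
For illustration only, the half-site logic described above can be sketched in Python. The sketch assumes the commonly cited cleavage patterns (XhoI C^TCGAG and Sau3AI ^GATC, each leaving a four-base 5' extension) and assumes that the XhoI arms are filled in with dTTP and dCTP and the Sau3AI fragments with dGTP and dATP. None of these details is stated in the catalog excerpt, and the function names are hypothetical.

```python
# Illustrative sketch only: cut sites and dNTP choices are assumptions
# (textbook values), not details taken from the catalog excerpt.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def partial_fill_in(overhang, dntps):
    """Return the 5' overhang (5'->3') left after a partial fill-in.

    The polymerase extends the recessed 3' end, copying the overhang
    from its 3' end toward its 5' end, and stops at the first template
    base whose complementary dNTP has not been supplied.
    """
    remaining = overhang
    while remaining and COMPLEMENT[remaining[-1]] in dntps:
        remaining = remaining[:-1]
    return remaining


def can_anneal(end1, end2):
    """Two 5' overhangs can base-pair if one is the reverse complement
    of the other."""
    return end1 == reverse_complement(end2)


if __name__ == "__main__":
    xho_half = partial_fill_in("TCGA", {"T", "C"})   # XhoI arm (assumed dNTPs)
    sau_half = partial_fill_in("GATC", {"G", "A"})   # Sau3AI insert (assumed dNTPs)
    print("XhoI half-site overhang:", xho_half)
    print("Sau3AI half-site overhang:", sau_half)
    print("arm + insert:", can_anneal(xho_half, sau_half))
    print("arm + arm:", can_anneal(xho_half, xho_half))
    print("insert + insert:", can_anneal(sau_half, sau_half))
```

Under these assumptions the untreated four-base extensions are each self-complementary, whereas the two-base extensions remaining after partial fill-in pair only arm-to-insert, which is consistent with the statement that self-ligation of arms, stuffer, and genomic fragments is prevented.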

As measured by in vitro packaging,
recombinant efficiencies of 3 x 10' pfu/µg DNA
have been achieved using a test insert ligated to
LambdaGEM-11 dephosphorylated BamHI or XhoI half-
site arms using Packagene® lambda packaging
extracts. The background for self-ligated arms
alone is typically <100 pfu/µg DNA in either
case. This ultra-low background level of non-
recombinant vector DNA has three important
advantages: it eliminates the need for the Spi
genetic selection against the parental vector,
which is known to result in biased libraries (5);
non-productive ligation events are minimal,
thereby resulting in larger genomic libraries;
and fewer filters need be processed for screening
a library. For detailed protocols describing the
use of this vector, see Sec. 11, pg. 12.

Section 11, page [number illegible]:

The LambdaGEM-12 vector is a multi-
functional genomic cloning vector designed for
high resolution restriction mapping of
recombinant inserts, simplified genomic library
construction, ultra-low background of non-
recombinants, and rapid genomic walking. This
lambda replacement-type cloning vehicle contains
the following features (Figure 4 [not shown]):
dual opposed bacteriophage T7 and SP6 RNA
polymerase promoters, RNA polymerase promoters
[sic], flanking asymmetric SfiI restriction
sites, and a multiple cloning site with
strategically positioned NotI and BamHI
restriction sites. The LambdaGEM-12 vector also
contains unique sites for SacI, EcoRI, XhoI, and
XbaI. Because it is a derivative of EMBL3 (1),
DNA fragments ranging from 9-23 kb can be cloned
in the LambdaGEM-12 vector and the Spi phenotypic
selection against non-recombinants is available.

Accordingly, the present invention
relates to a vector that is either LambdaGEM-11
or LambdaGEM-12. Further details of the use of
these vectors for restriction mapping of inserted
DNA segments, according to the SfiI Linker Mapping
System mentioned above, are extracted from the
Promega catalog below.

Section 11, page 8:

The multi-functional LambdaGEM-11 and
LambdaGEM-12 genomic cloning vectors have been
engineered specifically for high resolution
restriction mapping of recombinant inserts. The
vectors, derivatives of EMBL3, possess SfiI
restriction sites flanking bacteriophage T7/SP6
RNA polymerase promoters and a multiple cloning
region (Sec. 11, pg. 5). The flanking SfiI
restriction sites allow most inserts to be
excised as a single fragment, since this 8-base
recognition sequence occurs infrequently in
genomic DNA (in theory, once every 65,536 bp).
SfiI recognizes the interrupted
palindrome GGCCNNNN/NGGCC and cleaves within the
central unspecified sequence, leaving a 3-base 3'
overhang. The nucleotide sequence of the central
region which becomes the overhanging termini thus
may contain any of the four possible bases. The
flanking SfiI sites in the vectors have been
designed in an asymmetric fashion, so that the
site on the left is distinct from the site on the
right. Therefore, radiolabeled linkers
complementary to either the left or right SfiI
termini can be ligated separately to the SfiI
excised genomic DNA. Once the insert has been
asymmetrically labeled, a high resolution
restriction map can be determined by partial
digestion with a frequent cutting restriction
endonuclease such as Sau3AI followed by gel
electrophoresis and autoradiography (Figure 6
[not shown]).
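
The theoretical frequency quoted above follows from simple counting, which may be sketched as follows. The sketch assumes four equally frequent, independently distributed bases, an idealization that real genomes do not satisfy; the function names are hypothetical.

```python
# Illustrative sketch: expected spacing of restriction sites in random DNA,
# assuming equal base frequencies and independence between positions.

def expected_spacing(specified_bases):
    """Average distance in bp between occurrences of a recognition
    sequence having the given number of fully specified positions."""
    return 4 ** specified_bases


def expected_sites(length_bp, specified_bases):
    """Expected number of such sites in a random sequence of length_bp."""
    return length_bp / expected_spacing(specified_bases)


if __name__ == "__main__":
    # SfiI specifies 8 positions (GGCC ... GGCC); Sau3AI specifies 4 (GATC).
    print("SfiI spacing (bp):", expected_spacing(8))
    print("Sau3AI spacing (bp):", expected_spacing(4))
    for insert_kb in (9, 23):
        n = expected_sites(insert_kb * 1000, 8)
        print(f"expected SfiI sites in a {insert_kb} kb insert: {n:.2f}")
```

On this idealized model an insert in the 9-23 kb range is expected to contain well under one internal SfiI site, while a four-base cutter such as Sau3AI is expected to cut many times, which is why the former suits excision and the latter suits partial-digest mapping.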

The mapping resolution of this method is 
an order of magnitude greater than conventional 
cos site oligo labeling, since only the ends of 
the centrally located insert are labeled instead 
of the ends of the 2 0kb and 9kb arms of the 
vector. The variable results generated from 
inaccurate size estimates of large restriction 
size fragments, as well as anomalous bands which 
result from the fusion of the insert with a 
vector fragment, are eliminated with this system. 
For a detailed protocol describing the use of 
this system, see Sec. 11, pg. 14.

Still further, the present invention
relates to a vector having nonidentical non-
symmetrical restriction enzyme recognition site
sequences, as described above, also including
regulatory elements located such that the
sequences of an inserted DNA segment are
transcribed, as in the LambdaGEM™ vectors above,
but where the regulatory elements are at least
partly of eukaryotic origin. A principal
embodiment of this aspect of the present
invention is exemplified by two λ-plasmid
composite vectors, λpCEV15 and λpCEV9, the
structures of which are depicted in Figure 1 and
described further below.

In cloning operations with these two
vectors, a DNA segment, a cDNA, for example, is
cloned between two SfiI sites, A and B, as
described in the section below relating to the
automatic directional cloning method. The
vectors are designed as eukaryotic expression
vectors, utilizing the M-MLV LTR promoter, and
they contain the SV40 early promoter-driven neo
gene as a selectable marker.

Thus, the present invention further
relates to a genetic cloning vector comprising at
least one replicon, and a site for inserting DNA
segments to be cloned that includes at least two
nonidentical restriction enzyme recognition
sequences that are non-symmetrical, where the
vector also includes a selectable marker that is
functional in eukaryotic cells in which the
vector can be replicated. The term "functional"
as used here means that the gene for the marker
is expressed and that a selection scheme for that
marker is operable in these eukaryotic cells.

In these two particular exemplary
vectors, the form of λ-plasmid composite vector
was chosen to take advantage of the efficient
packaging and high density screening in λ
systems, and simpler DNA preparation and analysis
in plasmid systems. After isolation of clones of
interest, pCEV plasmids with cDNA inserts can be
obtained by NotI digestion of crude λ DNA
preparations and ligation followed by
transformation of bacterial cells. The λ
genotype which supports healthy growth (red+,
gam+) was chosen to maintain the intactness of
inserts during the amplification step of the
libraries. Deletion and/or insertion derivatives
generated during the amplification step, if any,
would not accumulate in the population, since
they would not have a growth advantage.

λpCEV15 has several advantages over
λpCEV9 as follows: (1) λpCEV15 does not require
the supF mutation in the host cell due to the S+
allele in the λ genome. (2) λpCEV15 does not
lysogenize host strains due to the deletion of
the cI gene. (3) cDNA inserts in λpCEV15 can be
cut out by SalI digestion. (4) λpCEV9 loses the
functional SV40 promoter after the cDNA insert is
cloned, while λpCEV15 does not. (5) λpCEV15 can
accommodate longer cDNA inserts (up to 10.5 kb)
than λpCEV9 (up to 8.5 kb). (6) λpCEV15 DNA
contains a unique HindIII site.

λpCEV9 has at least two advantages over
λpCEV15. The λpCEV9 genome has a stuffer fragment
between the two SfiI sites, which would be
replaced by cDNA inserts during the cloning process.
Generally, it has been found that λpCEV9 cDNA
libraries have lower backgrounds than λpCEV15
libraries, presumably because the presence of
the stuffer fragment in λpCEV9 separates the two
SfiI sites enough to ensure complete SfiI
cleavage. It has also been observed that λpCEV9
grows more stably without accumulation of
fast-growing derivatives. This is likely due to
the longer size of the genome compared to that of
λpCEV15.



In another aspect, the present invention
relates to a method for cloning of DNA segments
referred to as the automatic directional cloning
method. In particular, the present invention
relates to a method for cloning a cDNA copy of a
eukaryotic mRNA, comprising the following steps
(which are further illustrated in Figure 3 and in
the Description of Specific Embodiments, below):

(i) annealing a linker-primer DNA
segment comprising a single-stranded
oligonucleotide which has oligo(dT) at the 3'
end, and a single-stranded extension at the 5'
end that is included in a first non-symmetrical
restriction enzyme recognition sequence;
[Note that this first recognition sequence is
identical to one of two non-symmetrical sites in
the vector that are used for directional cDNA
cloning.]

(ii) enzymatically synthesizing the
first strand of the cDNA from the linker-primer
that is annealed with the mRNA molecule;
[Typically, this may be accomplished using a
reverse transcriptase. During the first strand
synthesis reactions, the single-stranded
linker-primer is repaired so as to be double-
stranded. Thus, the single-stranded extension
referred to in this method may be present as such
in the linker-primer, or it may be produced from
a double-stranded region of linker-primer by
cleavage with a restriction enzyme following
ligation of the linker-primer to the cDNA.]

(iii) enzymatically synthesizing the 
second strand of the cDNA using the first strand 
as the template under conditions such that 
single-stranded extensions on the synthesized 
cDNA molecule are made double-stranded;
[Typically, the second strand is synthesized by
DNA polymerase I from the nicks on the RNA moiety
introduced by RNase H associated with the reverse 
transcriptase. The linker-primer is converted to 
the double-stranded form in the first or second 
strand synthesis step. T4 DNA polymerase 
treatment makes double-stranded any 
single-stranded extensions remaining on the 
synthesized cDNA molecule.] 

(iv) ligating onto the blunt-ended cDNA 
resulting from synthesizing the second strand, an 
adaptor DNA segment comprising a second non- 
symmetrical restriction enzyme recognition 
sequence that is nonidentical to the first non- 
symmetrical restriction enzyme recognition 
sequence; [In the case of a principal embodiment
of this aspect of the invention, ligation of the
adaptor directly adds one single-
stranded extension to the cDNA molecule.
Alternatively, however, this extension could be
exposed by cleavage of the recognition site on a
double-stranded portion of the adaptor after
ligation to the cDNA.]

(v) exposing the cDNA resulting from 
ligation with the adaptor to one or more 
restriction enzymes that can cleave the first and 
second non-symmetrical restriction enzyme 
recognition sequences under conditions such that 
both of these sequences are cleaved, resulting in 
the vector DNA having two single-stranded ends 
that are not complementary; 

[This restriction causes exposure of at least one 
of the single-stranded extensions needed on the 
cDNA by cleavage of the recognition site on the 
repaired linker-primer portion of the cDNA 
molecule. If the non-symmetrical site in the
adaptor is also uncleaved at this point, it may
also be restricted at this step. In a principal
embodiment, a single enzyme can cleave the non-
symmetrical sites on both the linker-primer and
the adaptor; but in other embodiments, two
different enzymes may be required.]

(vi) ligating the cDNA resulting from 
cleavage with the enzymes to DNA of a genetic 
cloning vector, where the vector comprises 

at least one replicon; and 

a site for inserting DNA segments to be 
cloned that includes at least two non-symmetrical 
restriction enzyme recognition sequences, 

and where in the vector DNA, at least 
two non-symmetrical restriction enzyme 
recognition sequences have been cleaved by one 
or more enzymes that can cleave those recognition 
sequences, resulting in vector DNA having two
single-stranded ends that are not complementary;
wherein further, 

[Thus, the two ends of the cleaved vector DNA 
cannot anneal and be ligated together. Cleavage 
of the vector DNA at both non-symmetrical sites 
usually releases a short DNA segment from between 
them, the "stuffer"; for the highest yield of
clones containing cDNA inserts, this stuffer is
removed from the cleaved vector DNA prior to
ligation of the vector with cDNA.]

one of the single-stranded ends on the
cleaved vector DNA has a sequence that is 
complementary to the single-stranded extension on 
the linker-primer attached to the cDNA; and 

the other single-stranded end on the 
cleaved vector DNA has a sequence that is
complementary to the single-stranded extension on 
the adaptor attached to the cDNA; and 
[Thus the cDNA cannot circularize and is attached
to the vector in a specific direction.]

(vii) transforming a suitable host cell
with the recombinant DNA segment comprising the
cDNA and the vector DNA that results from the
ligation of cDNA to vector DNA; and

[Various genetic transformation methods known in
the art may be used. In a principal embodiment,
the vector is a form of bacteriophage λ and,
therefore, the recombinant DNA containing cDNA
inserts is packaged in vitro into phage particles
which are then used to infect a bacterial host
cell. Alternatively, for example, CaCl2
precipitation of DNA may be used to transform
host cells, especially mammalian cells.]

(viii) identifying a clone of host 
cells, resulting from transformation with said 
recombinant DNA, that contains a recombinant DNA 
segment including said cDNA. 

[Various strategies well known in the art of 
genetic engineering may be used to identify a 
clone of the desired cDNA, including 
hybridization with nucleic acid probes, 
immunological detection of expressed antigens, 
and assays for functional products, to name but a
few.]

The strategy underlying this cDNA 
cloning method of the present invention is based 
on the following theory, explained in terms of 
particular examples of a principal embodiment,
which is presented to aid in understanding the 
method and does not in any way limit the scope of 
the invention as defined by the appended claims. 

When vector and insert DNA fragments are 
mixed and ligated in a typical cloning 
experiment, several molecules are produced in 
addition to those desired. These include 
self-ligation products of vectors or inserts,
head-to-tail or head-to-head dimers of vector or
insert, and vector DNA containing multiple
inserts. Formation of these molecules would
reduce the cloning efficiency. Even when vector
and insert DNAs are made by cutting with two
different enzymes, formation of ligation products
such as head-to-head dimers cannot be
eliminated, although the population of desired
molecules is significantly higher and insertion
occurs in a defined orientation.

The reason why these self-ligation
products and dimers are made, as noted above, is
that the majority of restriction enzymes in common
usage recognize sequences of dyad symmetry. The
two sticky ends (S+ and S-) created with such an
enzyme contain the same single-stranded
extensions, and all combinations of the ends
including S+ and S+ can be ligated. However,
certain restriction enzymes cleave a
non-symmetrical site (A), yielding two different
sticky ends (A+ and A-). In this case, only A+
and A- ends can be ligated (see Fig. 2). When a
vector DNA containing two different sites (A and
B) with this feature is cleaved by restriction
enzymes of this kind, the stuffer fragment hemmed
by the sites is removed, and ligation is
performed with inserts having sticky ends
complementary to those of the vector,
theoretically all of the clones obtained contain
single inserts in the defined orientation.
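
The pairing rules set out in this theory may be illustrated with a short Python sketch. The representation of each sticky end as a (site, half) pair, the fragment names, and the function names are assumptions made solely for illustration; the sketch enumerates which end-to-end joins the base-pairing rules permit and does not model ligation kinetics.

```python
# Illustrative sketch of the sticky-end pairing rules described above.
# An end is (site, half): half is "" for a symmetric site, "+" or "-" for
# the two distinct halves of a non-symmetrical site. Names are hypothetical.
from itertools import combinations_with_replacement


def compatible(end1, end2):
    """Ends ligate only if they derive from the same site and, for a
    non-symmetrical site, from opposite halves (+ with -)."""
    (site1, half1), (site2, half2) = end1, end2
    if site1 != site2:
        return False
    if half1 == "" and half2 == "":
        return True  # symmetric site: both halves carry the same extension
    return {half1, half2} == {"+", "-"}


def allowed_joins(fragments):
    """List every pair of fragment ends that can be ligated. A pair of the
    same end with itself stands for two copies joined head-to-head."""
    ends = [(f"{name}.{i}:{site}{half}", (site, half))
            for name, pair in fragments.items()
            for i, (site, half) in enumerate(pair, start=1)]
    return [(label1, label2)
            for (label1, e1), (label2, e2)
            in combinations_with_replacement(ends, 2)
            if compatible(e1, e2)]


one_symmetric = {"vector": (("S", ""), ("S", "")),
                 "insert": (("S", ""), ("S", ""))}
two_symmetric = {"vector": (("S1", ""), ("S2", "")),
                 "insert": (("S1", ""), ("S2", ""))}
non_symmetrical = {"vector": (("A", "+"), ("B", "-")),
                   "insert": (("A", "-"), ("B", "+"))}

if __name__ == "__main__":
    for label, frags in [("one symmetric enzyme", one_symmetric),
                         ("two symmetric enzymes", two_symmetric),
                         ("two non-symmetrical sites", non_symmetrical)]:
        joins = allowed_joins(frags)
        print(f"{label}: {len(joins)} permitted joins")
        for join in joins:
            print("   ", " + ".join(join))
```

In this simplified model only the two vector-to-insert joins survive when both sites are non-symmetrical, whereas head-to-head joins remain possible whenever symmetric sites are used, even with two different enzymes.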

In a principal embodiment of this aspect
of the invention, the restriction enzyme SfiI was
chosen to cleave both the A and B sites, because
SfiI is an infrequent cutter and leaves a
non-symmetrical 3' extension of three nucleotides
(Fig. 2A). Since the central 5 bases in the
recognition site can be any sequence, two SfiI
sites (A and B) were designed and introduced into
the vectors (Figs. 1 and 2B). The cDNA fragments
to be inserted into the vectors were oriented by
the use of oligo(dT) primers having attached the
sequence of the SfiI(B) site.

The steps for cDNA synthesis are 
schematically shown in Fig. 3. During the first 
strand synthesis reactions, the single-stranded 
linker-primer is repaired so as to be double- 
stranded. After cDNA molecules are blunt-ended, 
an adaptor, designated SfiI adaptor, having the
3' extension which fits to the SfiI(A+) end, is
ligated. After cleavage by SfiI, the resulting
cDNA molecules have different 3' extensions which
fit on the vector ends to achieve directional
cloning (Fig. 2C).

Thus, regardless of the sequences of the 
three-base single-stranded sticky ends, inversion 
of one SfiI end sequence can never produce a
self-complementary sequence. Accordingly,
regardless of the sequence of the five arbitrary
internal base pairs within any SfiI cleavage
site, the polarity of the complementarity of the
sticky ends will always be maintained. Thus,
such inherently non-symmetrical sticky ends, as
well as non-symmetrical variants of recognition
sequences for which some forms can have dyad
symmetry, as described above (e.g., BstXI), are
also useful for practicing the automatic 
directional cloning method of the present 
invention, for efficient ligation of DNA 
fragments in a predirected order. Screenings of 
cDNA libraries constructed by this method, as 
described below, demonstrated that cDNAs of up to 
6.4 kilobase pairs containing complete coding 
sequences could be isolated at high efficiency. 
Thus, this cloning system is particularly useful
for the isolation of cDNAs of relatively long
transcripts present even at low abundance in
cells.
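
The statement that a three-base sticky end can never be self-complementary can be checked by exhaustive enumeration, as in the following Python sketch. The example site sequence used in the demonstration is hypothetical and is not the SfiI(A) or SfiI(B) sequence of the vectors; the function names are likewise assumptions for illustration.

```python
# Illustrative sketch: SfiI-type 3' overhangs and self-complementarity.
# The example site below is hypothetical, not a sequence from the vectors.
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sfi_overhangs(site):
    """Given the 13-base upper strand GGCCNNNN/NGGCC, return the 3'
    overhangs (each written 5'->3') of the left and right half sites."""
    if len(site) != 13 or site[:4] != "GGCC" or site[9:] != "GGCC":
        raise ValueError("not of the form GGCCNNNNNGGCC")
    left = site[5:8]                  # bases 6-8 of the upper strand
    right = reverse_complement(left)  # same positions on the lower strand
    return left, right


def self_complementary_count(length):
    """Count overhangs of the given length that equal their own reverse
    complement and could therefore pair with a copy of themselves."""
    return sum(1 for bases in product("ACGT", repeat=length)
               if "".join(bases) == reverse_complement("".join(bases)))


if __name__ == "__main__":
    print("3-base self-complementary overhangs:", self_complementary_count(3))
    print("4-base self-complementary overhangs:", self_complementary_count(4))
    left, right = sfi_overhangs("GGCCATTATGGCC")  # hypothetical site
    print("left half 3' overhang:", left)
    print("right half 3' overhang:", right)
```

Because the central base of an odd-length overhang would have to be its own complement, no three-base overhang passes the test, whereas some four-base overhangs, such as those left by many enzymes with symmetric sites, do.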

The present invention further relates to
DNA of a genetic cloning vector comprising at
least one replicon; and a site for inserting DNA
segments to be cloned that includes at least two
nonidentical restriction enzyme recognition
sequences that are non-symmetrical, in which the
non-symmetrical restriction enzyme recognition
sequences have been cleaved by one or more
enzymes that can cleave them, so that the DNA is
ready for use in cloning DNA segments having
matching sticky ends.

In a specific embodiment, exemplified
below, the present invention relates to a cloning
vector suitable for use in cloning DNA segments
for cDNAs, by means of stable phenotypic changes
induced by specific DNA segments. An example
of such vectors is λpCEV27 which differs
(described above) as follows:

(1) The M-MLV LTR fragment is replaced
by one derived from pZIPneoSV(X); one skilled in
the art will appreciate, however, that similar
fragments can be used.

(2) The bona fide promoter of the neo
gene is removed to fuse the SV40 promoter
directly to the neo structural gene. This
modification eliminates ATG codons upstream from
the translation initiation site of the neo gene,
thereby increasing expression of the neo gene in
mammalian cells. One skilled in the art will
appreciate that the neo gene can still be
expressed from the trp-lac fused promoter in
bacteria (i.e., E. coli).

(3) The second selectable marker in
bacterial cells, the ampicillin resistance gene
(amp), is introduced to permit selection of transformed
bacterial (i.e., E. coli) cells resistant to both
ampicillin and kanamycin, thus avoiding selection
of truncated plasmid clones. One skilled in the
art will appreciate that alternative markers can
be used.

(4) The sites for two additional 
infrequent cutters, XhoI and MluI, were included
along with the NotI site. Alternative infrequent
cutters can also be used to effect the purpose of 
efficiently effecting plasmid rescue. 

(5) The multiple cloning site (MCS) 
contains the restriction sites for BamHI, SalI,
SfiI(A), EcoRI, BglII, HindIII, SfiI(B), SalI, and
BstEII, in more convenient order.

(6) The SP6-P and T7-P phage promoters 
were introduced to synthesize sense and anti- 
sense RNA of cDNA inserts, respectively.
Alternative promoters can also be used.

(7) A phage origin was introduced to 
synthesize single-stranded DNA from the vector;
in the case of λpCEV27, f1 is used.

(8) The rat preproinsulin gene 
polyadenylation signal is added for efficient 
expression of DNA inserts (alternative signals 
can be used to effect the desired end result) . 

(9) The replication origin of pUC19 is 
used to increase the copy number of pCEV27. The 
replication origin of pCEV9 and pCEV15 is derived
from a short fragment of pBR322; since the ori
sequence lacks the promoter for replication
initiation, unstable replication results and thus
lower copy number. Replication origins similar
to pUC19 can also be used to increase copy
number.

Finally, the present invention also 
relates to a reagent kit comprising cleaved
vector DNA, ready for use in cloning, as
described above, and further including a linker-
primer having a single-stranded end that is
complementary to one single-stranded end of the
cleaved vector DNA; and an adaptor which, after
cleavage by a suitable restriction enzyme, has a
single-stranded end that is complementary to the
other single-stranded end of the cleaved vector
DNA. One skilled in the art of genetic
engineering would appreciate that such a kit
might advantageously also include appropriate
quantities of enzymes, buffers and other reagents
needed for the practice of the automatic
directional cloning method according to the
teachings of the present invention.

The present invention may be understood 
more readily by reference to the following 
detailed description of specific embodiments and 
the Examples and Figures included therein.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1. Structures of the vectors.
Panel (A), λpCEV9 and panel (B), λpCEV15. Each
vector contains a plasmid DNA within the λ DNA.
An expanded map of the plasmid portion is shown
with derivation of the DNA segments and the
location of several restriction sites including
the multiple cloning site (MCS). Arrows show the
locations of the promoters and the direction of
transcription.

Fig. 2. Scheme of the automatic
directional cloning system. Panel (A),
nucleotide sequences of the SfiI sites. The
general structure of the SfiI site is shown at
the top, where the letter N denotes any
nucleotide. The two SfiI sites specific to the
vectors, SfiI(A) and SfiI(B), are shown under the
general structure. The upper strands are shown
in the 5' to 3' direction, while the lower ones
are in the opposite direction. The bottom of the
figure shows the sequences of the ends produced
by the SfiI cleavage of the general SfiI site.
The left and right half sites are denoted as
SfiI(+) and SfiI(-), respectively. Similarly,
the sequences of SfiI(A+), SfiI(A-), SfiI(B+),
and SfiI(B-) half sites can be derived from the
sequences of the SfiI(A) and SfiI(B) sites.
Panel (B), preparation of the λpCEV vector arms.
λpCEV vector DNA is shown at the top, where cosL
and cosR represent the left and right cohesive
ends of λ, respectively. The locations of the
SfiI(A) and SfiI(B) sites are shown as (A) and
(B), respectively. Following ligation to seal
the cohesive ends (in the middle), the DNA is
cleaved by SfiI to expose the SfiI(A+) and
SfiI(B-) sticky ends of the vector molecules.
The small stuffer fragment is removed to prepare
the vector arms shown at the bottom. The
sequences of the single-stranded extensions are
shown at both the 3'-ends of the cDNA and vector
arms. Panel (C), formation of the λ concatemers
containing cDNA inserts. The cDNA fragments shown
at the left side are prepared to give the
SfiI(A-) and SfiI(B+) sticky ends to the molecule
by the procedure described in Fig. 3. When the
fragments are ligated with the prepared vector
arms, alternating concatemers consisting of the
cDNA inserts and vector arms in the defined
orientation are produced automatically due to the
base-pairing specificity as shown in the middle.
In vitro packaging extracts cut out the DNA
segments hemmed by the two cos sites from the
concatemer to form the active λ phage particles
as shown at the bottom.

Fig. 3. Schematic view of the cDNA
synthesis. An mRNA molecule is shown at the top
with the cap structure (m7Gppp) and the poly(A)
stretch (AAAAA) at the 5'- and 3'-ends,
respectively. The linker-primer is the
single-stranded oligonucleotide which contains
the oligo(dT) at the 3' half, and the SfiI(B)
site (shown by an asterisk) at the 5' half. The
first strand is synthesized by Moloney murine
leukemia virus reverse transcriptase (M-MLV RT)
from the linker-primer hybridized with the RNA
molecule. The second strand is synthesized by
DNA polymerase I from the nicks on the RNA moiety
introduced by RNase H. The linker-primer is
converted to the double-stranded form in the
first or second strand synthesis step. T4 DNA
polymerase treatment makes double-stranded any
single-stranded extensions remaining on the
synthesized cDNA molecule. The SfiI adaptor
ligation adds one 3' single-stranded extension,
the SfiI(A-) sticky end, to the cDNA molecule.
Another 3' extension, the SfiI(B+) sticky end, is
exposed by SfiI cleavage of the SfiI(B) site on
the repaired linker-primer portion of the cDNA
molecule.

Fig. 4. Cloning of a model insert into
pCEV15 using the ADC method. Panel (A),
restriction map of pCEV15-RAS. The plasmid was
constructed by cloning a 0.7 kbp fragment
containing the mouse H-ras (v-bas) coding
sequence (Reddy et al., 1985) into the EcoRI site
of pCEV15. The open thick arc and closed thin
arc represent the H-ras insert and vector,
respectively. Panel (B), analysis of ligation
products. pCEV15-RAS DNA was digested with SfiI,
and vector (4.2 kb) as well as insert (0.7 kb)
fragments were purified from the gel. Similarly,
EcoRI/ApaI fragments were prepared as controls.
The vector and/or H-ras insert SfiI fragments
(left half) or EcoRI/ApaI fragments (right half)
were incubated in kinase ligase buffer (see
below) with or without T4 DNA ligase as
indicated. The ligation products were analyzed
by agarose gel electrophoresis. Sizes of the
fragments are shown in kb. Panel (C), HindIII
digestion of pCEV15 and its derivatives. (lane
a) pCEV15; (lane b) pCEV15-RAS; (lane c) pCEV15
containing the H-ras insert in opposite
orientation; (lane d) a marker (1-kb ladder;
Bethesda Research Laboratories); and (other
lanes) plasmid DNAs isolated from 20 individual
kanamycin-resistant colonies obtained by
transformation of DH5α with the vector ligated to
H-ras insert SfiI fragments.

Fig. 5. PDGF receptor clones isolated 
from the λpCEV9-M426 cDNA library. Panels (A)
and (B), cDNA clones encoding for β and α PDGF
receptors, respectively. The structure of each 
PDGF receptor cDNA is schematically shown with 
restriction sites. Open boxes represent coding 
sequences, while non-coding sequences are shown 
by bars. The clones shown by thick lines were 
isolated from the M426 cDNA library. The thin 
lines represent clones isolated from other 
libraries as described (Matsui,T., et al., 1989, 
Science 243, 800-804): HB15, HB3, and HB6 were 
derived from the human brain stem cell cDNA 
library in λgt11 (provided by R. Lazzarini;
Matsui et al. , 1989, supra); HF1 from the 
Okayama-Berg human fibroblast cDNA library 
(Okayama and Berg, 1982) ; and EF17 from a 
randomly-primed M426 cDNA library in λgt11
(Matsui et al., 1989, supra). Panels (C) and
(D), nucleotide sequences of 5'-untranslated
regions of β and α PDGF receptor clones HPR5 and
TR4, respectively. Sequencing was performed by
the chain termination method (Sanger et al.,
1977, Proc. Natl. Acad. Sci. USA 74, 5463-5467).
The initiation codons are underlined.

Figure 6. Structure of the cDNA
cloning-expression vector λpCEV27.

The structure of the λpCEV27 genome is shown
in the upper half with the location of λ genes.
The plasmid part is enlarged and shown in the
lower half as a circular map. The multiple
excision site (MES) contains the restriction
sites for the infrequent cutters NotI, XhoI, PvuI,
and MluI. The multiple cloning site (MCS)
contains the restriction sites for BamHI, SalI,
SfiI(A), EcoRI, BglII, HindIII, SfiI(B), SalI,
and BstEII, and was placed in the clockwise
orientation. The two SfiI sites are used to
insert cDNA molecules by the automatic
directional cloning method (Miki, T., et al.
(1989) Gene 83, 137-146), and the two SalI sites
are used to release the inserts. SP6-P and T7-P
represent the phage promoters for SP6 and T7 RNA
polymerases, respectively. The trp-lac fused
promoter tac and the SV40 early promoter are used
to express the neo structural gene in E. coli
(kanamycin resistance) and eukaryotic cells
(G-418 resistance), respectively. The directions
of transcription from the promoters are shown by
the arrows. Polyadenylation signals are labeled
as polyA. The locations of the replication
origins (ori) and the ampicillin resistance gene
(amp) are shown.




Figure 7. Strategies for expression cloning of
transforming gene cDNAs.

NIH 3T3 cells are transfected with
λpCEV27 cDNA library DNA. Transformed cells are
isolated from induced foci and assayed for G-418
resistance and colony formation in soft agar.
Cells are expanded and the genomic DNA is isolated.
The DNA is digested by either NotI, XhoI or MluI,
and then ligated at low concentration. A
bacterial strain is transformed by the ligated
DNA, and colonies resistant to both ampicillin
and kanamycin are isolated. Plasmid DNA is
extracted from each colony and used to transfect
NIH 3T3 cells to examine focus formation. Since
each cDNA library-induced focus is presumed to
contain multiple cDNA clones, transforming plasmids
are identified in this transfection assay.

Figure 8. Detection of hepl cDNA inserts in the
ST18-2 cDNA library-induced transformants.

Genomic DNAs from individual
transformants (from CT18-1A to CT18-1G) were
digested by SalI, which can release the cDNA
inserts (see Figure 6). The digested DNA (5 µg)
was separated on a 0.5% agarose gel by
electrophoresis and transferred to a supported
nitrocellulose membrane (Nitrocellulose GTG, FMC
BioProducts). The Southern blot was probed with
the hepl cDNA insert of pHEPl-B, which was rescued
from the transformant T18-B. NIH 3T3 genomic
DNA was used as a negative control. The location
of each fragment of the molecular size marker (1
kb ladder, BRL) is shown at the right side in kb.

Figure 9. Sequence homology of the hepl and
B-raf gene cDNAs.

A restriction map of the cDNA insert of
pHEPl-B is shown schematically at the top (a).
The regions where the nucleotide sequence was
determined are shown by arrows and labeled A
and B. The sequences are shown below (b).
Computer analysis was performed with
IntelliGenetics programs. The B-raf sequence was
taken from and numbered as in Ikawa et al.
(1988).

Figure 10. Rearrangement and amplification of
the hepl locus in the primary and secondary
transformants.

The sources of DNA and the restriction
enzymes used are shown at the top. The strains
PT-1 and PT-2 are the primary transformants
induced by the original tumor DNA. The strain
18-1 was a secondary transformant induced by PT-2
DNA and is the source of the cDNA library. The
genomic DNA (5 µg) was digested by SalI to
release the cDNA inserts, separated on a 0.5%
agarose gel by electrophoresis, and transferred
to a supported nitrocellulose membrane. The
Southern blot was probed with the hepl cDNA insert
of pHEPl-B. NIH 3T3 genomic DNA was used as
a negative control. The locations of several
fragments of the molecular size markers (1 kb
ladder and high molecular weight DNA marker, BRL)
are shown at the right side in kb.

Figure 11. Detection of mRNAs for the bral and
B-raf genes.

The poly(A)+ RNA extracted from the cells
indicated was denatured and separated on a
formaldehyde gel. RNA was transferred to a
supported nitrocellulose membrane and probed with
each probe. The 5' probe was isolated as the
SalI-HindIII fragment (see Figure 8). The 3'
probe was prepared by polymerase chain reaction
(PCR) using the GeneAmp kit (Cetus Co.) from CT18-2B
genomic DNA. The B-raf primer
(5'-CCTCGAGATTCAAGTGATGAC-3') and the T7 primer
(5'-CTAATACGACTCACTATAGGGG-3') were used for PCR,
and the amplified fragment was purified from an
agarose gel.
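
As a rough sanity check on the two PCR primers quoted in this legend, the following sketch computes their lengths and a Wallace-rule melting estimate (2 °C per A/T, 4 °C per G/C). The function name `wallace_tm` is illustrative, the T7 primer is written without the stray space that appears in the printed sequence, and the Wallace rule is a coarse approximation that is not taken from this specification.

```python
def wallace_tm(primer: str) -> int:
    """Wallace-rule estimate: 2 degC per A/T plus 4 degC per G/C."""
    seq = primer.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("primer contains a character other than A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as quoted in the legend (5'->3').
PRIMERS = {
    "B-raf": "CCTCGAGATTCAAGTGATGAC",
    "T7": "CTAATACGACTCACTATAGGGG",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, Wallace Tm ~ {wallace_tm(seq)} degC")
```

The two estimates fall within a few degrees of each other, which is the usual first check before pairing primers in one reaction.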

FIG. 12 Cell morphology of control
NIH/3T3 cells and transformants induced by the
keratinocyte cDNA expression library.

NIH/3T3 cells (A) and NIH/3T3 cells
transfected with ect1 (B), ect2 (C), or ect3
(D) at 21 days post-transfection. Cells were
maintained in Dulbecco's modified Eagle's medium
(DMEM) containing 5% calf serum. (x 180)

FIG. 13 Specific binding of [125I]-KGF to
BALB/MK, NIH/3T3, and NIH/ect1, NIH/ect2, and
NIH/ect3 transformants.

Methods: Recombinant KGF was radiolabeled with
[125I]-Na by the chloramine-T method as described
previously. Confluent cultures in 24-well plates
were serum-starved for 24 h, followed by
incubation with HEPES binding buffer (HBB; 100 mM
HEPES, 150 mM NaCl, 5 mM KCl, 1.2 mM MgSO4, 8.8 mM
dextrose, 2 mg/ml heparin, and 0.1% BSA, pH 7.4)
containing [125I]-KGF for 1 h at 22°C. The cells
were then washed with cold PBS and lysed with 0.5%
SDS, and cell-associated radioactivity was
measured in a gamma counter. Bound cpm were
normalized to the cell protein content of SDS
extracts. Specific binding was determined by
subtracting normalized cpm of samples incubated
with 100-fold excess unlabeled KGF from the
normalized cpm bound in the presence of [125I]-KGF
alone.
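
The specific-binding calculation described in these Methods can be sketched as follows. This is a minimal illustration only: the function name, argument names, and all numbers are hypothetical and are not data from the specification.

```python
def specific_binding(total_cpm, total_protein_mg,
                     nonspecific_cpm, nonspecific_protein_mg):
    """Protein-normalized specific binding (cpm per mg protein).

    total_*       : well incubated with labeled KGF alone
    nonspecific_* : well incubated with labeled KGF plus 100-fold
                    excess unlabeled KGF
    """
    total_norm = total_cpm / total_protein_mg
    nonspecific_norm = nonspecific_cpm / nonspecific_protein_mg
    return total_norm - nonspecific_norm


# Hypothetical example values, for illustration only.
example = specific_binding(
    total_cpm=12000, total_protein_mg=0.25,
    nonspecific_cpm=1500, nonspecific_protein_mg=0.5,
)
print(f"specific binding: {example:.0f} cpm/mg protein")
```

Normalizing each well before subtracting keeps differences in cell number between wells from being counted as binding.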

FIG. 14 DNA and RNA analysis of the
ect1 sequence. a, Southern analysis of the
SalI-digested DNAs from NIH/3T3 and its
transformants. The blot was probed with the
entire ect1 cDNA insert. Since SalI is an
infrequent cutter of mammalian DNA, most of the
DNA fragments are extremely large and migrate
near the origin of the gel. However, the cDNA
inserts released by SalI from the vector are
shorter and migrate into the gel, allowing
detection of the insert without detection of the
endogenous ect1 gene.

b, Southern analysis of EcoRI-digested
DNAs of different animal species (Clontech, Palo
Alto, CA). The blot was probed with the 5'-half
of the ect1 cDNA insert and washed under reduced
stringency conditions.

c, Northern analysis of NIH/3T3 and
BALB/MK RNA. The blot was probed with the
5'-half of the ect1 cDNA (lanes 1 and 2) or
β-actin cDNA (lanes 3 and 4) and washed under
stringent conditions.

Methods: For plasmid rescue, genomic DNA was
cleaved by one of the infrequent cutters which
can release the plasmids containing cDNA inserts.
Digested DNA was ligated under dilute conditions
and used to transform competent bacterial cells.
Plasmids were isolated from ampicillin- and
kanamycin-resistant transformants and used to
transfect NIH/3T3 cells to examine focus
formation. The ect1 plasmid was rescued by XhoI,
while the ect2 and ect3 plasmids were rescued by
NotI digestion. For Southern analysis, DNA (10
µg) was digested by SalI (Panel a) or EcoRI
(Panel b), fractionated by agarose gel
electrophoresis, and transferred to a
nylon-supported nitrocellulose paper
(Nitrocellulose-GTG, FMC, Rockland, ME). The
blot in Panel a was hybridized with the
32P-labeled entire ect1 insert at 42°C and washed
at 65°C in 0.1 x SSC, while the blot in Panel b
was hybridized with the 32P-labeled 5'-ect1 probe
(see Fig. 15b) at 37°C and washed at 55°C in
0.1 x SSC. Location of DNA molecular weight
markers (BRL, Gaithersburg, MD) is indicated in
kb. For Northern analysis (Panel c), poly(A)+ RNA
(5 µg each) was fractionated on a formaldehyde gel,
transferred to Nitrocellulose GTG, and hybridized
with the 5'-ect1 probe (lanes 1 and 2). After
autoradiography, the filter was boiled to remove
the probe and then hybridized with a β-actin
probe (Gunning et al., Molec. Cell Biol. 3,
787-795 (1983)) (lanes 3 and 4). Location of
molecular weight markers (BRL, Gaithersburg, MD)
is indicated in kb.

All hybridization experiments were 
performed at the indicated temperature in a 
solution containing 50% formamide, 5 x SSC, 2.5 x 
Denhardt's solution, 7 mM Tris-HCl (pH 7.5), 0.1 
mg/ml of denatured calf thymus DNA, and 0.1 mg/ml 
of tRNA. 

FIG. 15 Nucleotide sequence of the KGF receptor
cDNA.

a, Nucleotide sequence and deduced
amino acid sequence of the coding region of the
KGF receptor cDNA. Nucleotides are numbered
from the 5'-end of the cDNA. Initiation and
termination codons are underlined. Amino acids
are numbered from the putative initiation site of
translation, with the numbers shown above the amino
acid sequence. Potential sites of N-linked
glycosylation are indicated by dots above the
residues. The potential signal peptide and
transmembrane domains are underlined. The
interkinase domain is shown by underlined italic
letters. Glycine residues considered to be
involved in ATP binding are indicated by
asterisks. Cysteine residues that delimit two
immunoglobulin-like domains in the extracellular
portion of the molecule are shown by : over the
residues. The nucleotide sequence was determined by
the chain termination method (Sanger et al., PNAS
74, 5463-5467 (1977)).

b, Structural comparison of the
predicted KGF and bFGF receptors. The region
used as a probe for Southern and Northern
analysis (Fig. 14b and c) is indicated. The
region homologous to the published bek sequence (31)
is also shown. The schematic structure of the
KGF receptor is shown below the restriction map
of the cDNA clone. Amino acid sequence
similarities with the smaller and larger bFGF
receptor variants are indicated. S, signal
peptide; IG1, IG2, and IG3, immunoglobulin-like
domains; A, acidic region; TM, transmembrane
domain; JM, juxtamembrane domain; TK1 and TK2,
tyrosine kinase domains; IK, interkinase domain;
C, C-terminal domain. Amino acid sequence
comparison was performed using the method of
Pearson and Lipman (Pearson et al., PNAS 85,
2444-2448 (1988)).

FIG. 16 Competition of KGF, aFGF, and bFGF for
[125I]-KGF binding on BALB/MK cells (A) and
NIH/ect1 cells (B).
Methods: Binding assays were performed as
described for Fig. 13, except that cells were
incubated with [125I]-KGF in the presence of
unlabeled KGF, aFGF or bFGF at the concentrations
indicated on the x-axis. For Scatchard analysis,
samples contained several concentrations of
[125I]-KGF (1-100 ng/ml) in the presence or absence
of a 100-fold excess of unlabeled KGF, and were also
processed as outlined in Fig. 13. Estimates of
receptor affinity and total binding capacity were
made using LIGAND software (Munson et al., Anal.
Biochem. 107, 220-239 (1980)).
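
To illustrate what a Scatchard analysis estimates, the sketch below fits the linear Scatchard form B/F = (Bmax - B)/Kd by ordinary least squares. It is not the LIGAND algorithm, which performs a weighted nonlinear fit; the function name, the single-site model, and the synthetic numbers are all assumptions made for this example.

```python
def scatchard_fit(free, bound):
    """Least-squares line through (B, B/F); returns (Kd, Bmax).

    For a single class of sites, B/F = Bmax/Kd - B/Kd, so the slope
    is -1/Kd and the x-intercept is Bmax.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Synthetic, noise-free single-site data (hypothetical values and units).
KD_TRUE = 0.4       # same concentration unit as `free`
BMAX_TRUE = 2.0e4   # e.g. sites per cell
free = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
bound = [BMAX_TRUE * f / (KD_TRUE + f) for f in free]

kd_est, bmax_est = scatchard_fit(free, bound)
print(f"Kd ~ {kd_est:.3f}, Bmax ~ {bmax_est:.0f}")
```

With real data the bound values would first be corrected for nonspecific binding, and curvature in the plot would suggest more than one class of sites.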

FIG. 17 a, Covalent affinity crosslinking of
[125I]-KGF to BALB/MK (left), NIH/3T3 (center), and
NIH/ect1 cultures (right). The left and center
panels of this autoradiogram were exposed to
Kodak XAR film for 72 h at -70°C; the right panel
is an 18 h exposure of the same autoradiogram.
The second lane for each cell type shows
crosslinking performed in the presence of excess
unlabeled KGF. Molecular weight markers are
indicated on the left; the positions of
[125I]-KGF-crosslinked complexes are indicated by
arrows.

b, Autoradiogram of
phosphotyrosyl proteins from intact NIH/3T3
(left) and NIH/ect1 cells (right) before and
after treatment with KGF. Molecular weight
markers are indicated on the left; the estimated
molecular weights of proteins displaying
KGF-stimulated phosphorylation on tyrosine are
shown at right.

Methods: Samples for covalent crosslinking were
prepared from confluent, serum-starved cultures
in 6 cm dishes, using 10 ng/ml [125I]-KGF in the
presence or absence of 30-fold excess KGF. After
binding (as described for Fig. 13), crosslinking
with disuccinimidyl suberate was performed as
described previously. The cells were then
scraped into cold HBB containing 0.1 mM aprotinin
and 1.0 mM phenylmethylsulfonyl fluoride, and a
crude membrane fraction was generated by brief
sonication (50 W, 10 sec), low-speed
centrifugation (600 x g, 10 min), and high-speed
centrifugation (100,000 x g, 30 min) of the
low-speed supernatant. The membrane pellet was
solubilized in Laemmli sample buffer (Laemmli,
Nature 227, 680-685 (1970)) containing 100 mM
dithiothreitol, and boiled for 3 min.
[125I]-labeled proteins were resolved by 7.5%
SDS-PAGE and detected by autoradiography of the
dried gel. Analysis of phosphoproteins was
performed as follows. Confluent cultures in 10 cm
dishes were serum-starved for 24 h, then treated
with (+) or without (-) KGF (30 ng/ml) for 10 min
at 37°C. The medium was aspirated, and the cells
were solubilized in cold HEPES buffer containing 1%
Triton X-100 and protease and phosphatase
inhibitors as described previously.
The lysate was cleared by centrifugation, and
phosphotyrosyl proteins were immunoprecipitated
with affinity-purified anti-Ptyr adsorbed to
GammaBind G-agarose (Genex, Gaithersburg, MD).
Phosphotyrosyl proteins were specifically eluted
using 50 mM phenyl phosphate, diluted in Laemmli
sample buffer, and resolved by 7.5% SDS-PAGE.
Proteins were then transferred to nitrocellulose
and detected with anti-Ptyr and [125I]-protein A as
described previously.

DESCRIPTION OF SPECIFIC EMBODIMENTS

In one aspect, the present invention
relates to a vector having nonidentical
non-symmetrical restriction enzyme recognition site
sequences, as described above, also including
regulatory elements located such that the
sequences of an inserted DNA segment are
transcribed, where the regulatory elements are at
least partly of eukaryotic origin. A principal
embodiment of this aspect of the present
invention is exemplified by two λ-plasmid
composite vectors, λpCEV15 and λpCEV9, the
structures of which are depicted in Figure 1 and
described further below, in Example 1.

To examine the performance of the ADC
method, a model H-ras insert was prepared so as
to have SfiI(A-) and SfiI(B+) ends (Fig. 4A) and
ligated with the pCEV15 SfiI fragment. To show
the difference between the ADC method and the
"forced" cloning method using two different
restriction enzymes, a similar H-ras fragment
with EcoRI and ApaI ends was prepared (Fig. 4A)
and ligated with the pCEV15 EcoRI/ApaI fragment.
To measure the efficiency of cDNA cloning using a
natural template, 2.5 µg of a poly(A)+ RNA
preparation was denatured by heating and used to
synthesize cDNA from a linker-primer. The
results of all these experiments, described in
Example 2, below, illustrate the high
efficiency of cloning of model inserts using the
novel method of the present invention.
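
The contrast between ADC and "forced" cloning can be illustrated by asking whether an end can pair with a copy of itself. The sketch below assumes the standard cut positions EcoRI G^AATTC, ApaI GGGCC^C, and SfiI GGCCNNNN^NGGCC, and takes the SfiI(A) and SfiI(B) overhangs from the linker oligonucleotides listed in Example 1; the function names are illustrative and not part of the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def can_self_pair(overhang: str) -> bool:
    """An end can anneal to a copy of itself only if its single-stranded
    overhang equals its own reverse complement."""
    return overhang == revcomp(overhang)


def can_pair(overhang_a: str, overhang_b: str) -> bool:
    """Two ends anneal when one overhang is the reverse complement of
    the other (both written 5'->3')."""
    return overhang_a == revcomp(overhang_b)


# Overhangs written 5'->3' (assumed cut positions, see lead-in).
OVERHANGS = {
    "EcoRI": "AATT",    # 5' overhang
    "ApaI": "GGCC",     # 3' overhang
    "SfiI(A)": "TTA",   # from GGCCATTATGGCC in oligonucleotide #1
    "SfiI(B)": "CCT",   # from GGCCGCCTCGGCC in oligonucleotide #3
}

for enzyme, overhang in OVERHANGS.items():
    print(f"{enzyme:8s} {overhang:5s} self-pairing: {can_self_pair(overhang)}")
```

Because a three-base overhang has an unpaired middle base when aligned with itself, it can never be self-complementary, which is consistent with the absence of self-ligation reported for the SfiI fragments.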

To assess the performance of the cDNA
cloning method, cDNA was synthesized using
poly(A)+ RNA extracted from M426 human embryonic
lung fibroblast cells under the conditions
described in Example 2, below. cDNA molecules
larger than 1 kb were selected by low melting
point agarose gel electrophoresis, and two
aliquots were used to clone into λpCEV9 and
λpCEV15. The average size of the cDNA inserts
was 2.0 kb in the λpCEV9 library (6 x 10*
independent clones) and 2.2 kb in the λpCEV15
library (1 x 10^7 independent clones).

To characterize the M426 cDNA library in
λpCEV9 further, it was screened for the human β
PDGF receptor cDNA. Before this cDNA library was
constructed, clones isolated (HB3 and HB15) by
screening several other libraries did not contain
the entire coding sequence (Fig. 5A). When a
part (9 x 10^5 pfu) of the M426 cDNA library was
screened for the human β PDGF receptor, six
clones were isolated. Of these, three contained
inserts of approximately 5 kb. Sequence analysis
showed that two (HPR2 and HPR5) contained the
entire coding sequence (Fig. 5A). Recently,
Matsui et al. (1989, supra) identified the
cDNA of a novel PDGF receptor, designated the α
PDGF receptor, by isolation of overlapping cDNA
clones (HF1, HB6 and EF17 in Fig. 5B).
Re-probing of filters from the M426 cDNA library for
the human α PDGF receptor resulted in the
isolation of 93 clones. Of 7 clones analyzed, 5,
including TR4, contained inserts of 6.4 kb, which
corresponded to the size of the message (Fig.
5B). As shown in Figs. 5C and 5D, sequence
analysis of α and β receptor cDNAs isolated from
the M426 library revealed 5'-untranslated
sequences followed by initiation codons for the
complete coding sequence of each gene. These
results indicate that the cDNA cloning system
described here is suitable for isolation of
relatively long cDNAs.
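
The screening figures above can be turned into approximate clone frequencies. The sketch below assumes that the re-probed filters carried the same 9 x 10^5 pfu as the first screen, which the text implies but does not state; the function name is illustrative.

```python
def clone_frequency(positives: int, plaques_screened: float) -> float:
    """Fraction of screened plaques that hybridized with the probe."""
    return positives / plaques_screened


PFU_SCREENED = 9e5  # assumed to apply to both screens (see lead-in)

beta_freq = clone_frequency(6, PFU_SCREENED)    # beta PDGF receptor screen
alpha_freq = clone_frequency(93, PFU_SCREENED)  # alpha PDGF receptor re-probe

for name, freq in (("beta", beta_freq), ("alpha", alpha_freq)):
    print(f"{name} PDGF receptor: about 1 clone per {1 / freq:,.0f} plaques")
```

Under that assumption the α receptor clones were roughly fifteen times more abundant in the library than the β receptor clones.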

This method has been used for more than
one year in the laboratory of these inventors,
without public disclosure, to construct several
cDNA libraries. Screening of the libraries for
growth factors and receptors has been performed,
and in most cases cDNA clones containing the
entire coding sequence have been obtained as
single clones. For example, a number of cDNA
clones encoding keratinocyte growth factor have
been isolated from the λpCEV9-M426 cDNA library
using oligonucleotide probes. Screening of an
MCF7 cDNA library in λpCEV9 constructed by the
ADC method for a novel erbB-related gene resulted
in the isolation of cDNA clones of 5 kb with
high frequency as well. All of these findings
indicate that the ADC method using λpCEV vectors
makes it possible to clone relatively long cDNAs
very efficiently.

From the present data, the following 
conclusions may be drawn: 

(1) The ADC procedure resulted in very
high cloning efficiency (10^7 to 10^8 clones/µg of
mRNA).

(2) Backgrounds of libraries
constructed by this method are usually very low.
When the vector arms are prepared carefully, almost
all of the clones contain cDNA inserts.

(3) cDNAs are inserted into the vectors
in a directed orientation and as single inserts,
making analysis of cDNA inserts simple and
straightforward.

(4) The vectors can accommodate longer
inserts than other λ vectors without sacrificing
cloning efficiency, making it possible to clone
relatively long cDNA fragments.

(5) Plasmids carrying cDNA inserts can
be released from λ genomes by NotI cleavage.
This feature facilitates the structural analysis
of cDNA clones, permits the generation of
size-selected plasmid sub-libraries, and makes it
possible to recover the cDNA clones by plasmid
rescue from eukaryotic cells.

The inventors have further refined the
vectors to allow high levels of cDNA expression
in mammalian cells and the ability to perform
plasmid rescue. Example 3 tests the potential of
the improved approach by its application to the
isolation and characterization of unknown
oncogenes from hepatocellular carcinomas of the
B6C3F1 mouse strain, extensively utilized in
long-term carcinogenesis testing in the United States.

Example 4 discloses the utility of the
refined directional cDNA library expression
vector for isolating the keratinocyte growth
factor (KGF) receptor cDNA by creation of an
autocrine transforming loop. This expression
cloning approach successfully identified and
functionally cloned the receptor for the new
growth factor.

Example 1. Construction of λpCEV15 and λpCEV9.

The following materials and methods were
used in this and the subsequent examples, as
needed.

Restriction enzymes, DNA polymerases, T4
DNA ligase, and T4 polynucleotide kinase were
purchased from New England BioLabs, Bethesda
Research Laboratories, and Boehringer Mannheim.
M-MLV reverse transcriptase and RNase H were from
Bethesda Research Laboratories. Bacterial
strain LE392 [F-, hsdR514 (rk- mk+) supE44 supF58
lacY1 or Δ(lacIZY)6 galK2 galT22 metB1 trpR55] was
used as a host for λ. DH5α (Bethesda Research
Laboratories) was used for bacterial
transformation. NZY broth (10 g NZ amine, 5 g
NaCl, 5 g yeast extract in 1 l, pH 7.5) was used
to grow bacterial strains. M426 is a human lung
embryonic fibroblast cell line (Aaronson and
Todaro, 1968, J. Virology 36, 254-261).
Oligonucleotides were synthesized on a Beckman
System 1 DNA Synthesizer and purified by high
performance liquid chromatography.
The oligonucleotides utilized had the following
sequences:

#1: GATCCGTCGACGGCCATTATGGCCAGAATTCTGGGCCCG,
#2: TCGACGGGCCCAGAATTCTGGCCATAATGGCCGTCGACG,
#3: AATTCAGGCCGCCTCGGCCAAGCTTAGATCTGGGCCCG,
#4: TCGACGGGCCCAGATCTAAGCTTGGCCGAGGCGGCCTG,
#5: TGGATGGATGG,
#6: CCATCCATCCATAA,

#7 and #8: GGACAGGCCGAGGCGGCC(T)n, where n = 20 or
40 in the case of #7 and #8, respectively.
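
The relationships among these oligonucleotides can be checked with a short script. The sketch below assumes the standard SfiI cut position GGCCNNNN^NGGCC, which leaves a three-base 3' overhang; the helper names are illustrative, and the script only tests sequence consistency, not how the linkers were actually used.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the list above.
OLIGO_1 = "GATCCGTCGACGGCCATTATGGCCAGAATTCTGGGCCCG"
OLIGO_2 = "TCGACGGGCCCAGAATTCTGGCCATAATGGCCGTCGACG"
OLIGO_3 = "AATTCAGGCCGCCTCGGCCAAGCTTAGATCTGGGCCCG"
OLIGO_5 = "TGGATGGATGG"
OLIGO_6 = "CCATCCATCCATAA"
LINKER_PRIMER_HEAD = "GGACAGGCCGAGGCGGCC"  # #7/#8 without the (T)n tail

SFII_SITE = re.compile(r"GGCC[ACGT]{5}GGCC")


def sfii_overhang(seq):
    """3-nt 3' overhang left on the strand as written, assuming the cut
    GGCCNNNN^NGGCC; returns None if no SfiI site is found."""
    match = SFII_SITE.search(seq)
    return match.group(0)[5:8] if match else None


def five_prime_overhangs(top, bottom, k=4):
    """Return the k-nt 5' overhangs of an annealed pair, or None if the
    remaining bases are not complementary."""
    if top[k:] == revcomp(bottom)[:len(bottom) - k]:
        return top[:k], bottom[:k]
    return None


linker_ends = five_prime_overhangs(OLIGO_1, OLIGO_2)
site_a = sfii_overhang(OLIGO_1)                # SfiI site in the MCS linker
site_b = sfii_overhang(OLIGO_3)                # SfiI site in the MCS-2 linker
primer_site = sfii_overhang(LINKER_PRIMER_HEAD)

# Adaptor: #5 pairs with the 5' part of #6; the unpaired 3' tail of #6
# is the adaptor's single-stranded overhang.
adaptor_pairs = revcomp(OLIGO_5) == OLIGO_6[:len(OLIGO_5)]
adaptor_overhang = OLIGO_6[len(OLIGO_5):]

print("MCS linker 5' overhangs:", linker_ends)
print("SfiI overhangs:", site_a, site_b, primer_site)
print("adaptor anneals:", adaptor_pairs, "overhang:", adaptor_overhang)
```

In this check the adaptor overhang is the reverse complement of the overhang from the SfiI site in oligonucleotide #1, and the linker-primer site gives the reverse complement of the overhang from the site in oligonucleotide #3, which is consistent with each cDNA end matching only one vector end.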

Plasmid DNA was prepared by the
"selective precipitation procedure", which is a
modification of the alkaline lysis method
(Birnboim and Doly, 1979, Nucleic Acids Res. 7,
1513-1523). This technique makes it possible to
prepare plasmid DNAs pure enough to analyze
and alter structures, without a requirement for
lysozyme treatment, phenol extraction or repeated
ethanol precipitations. Cells collected from a
10 ml culture were resuspended in 0.2 ml of TEG
(25 mM Tris-HCl pH 8.0/10 mM EDTA/50 mM
glucose). After transfer to a microcentrifuge
tube, 0.2 ml each of 2% sodium dodecyl sulfate
and 0.4 M NaOH were added, mixed, and incubated at
room temperature for 5 min. After the addition
of 0.2 ml of 3 M ammonium acetate (pH 4.8),
incubation at 0°C for 10 min, and centrifugation
for 15 min in a microcentrifuge, the supernatant
was transferred to a fresh tube containing 0.2 ml
of 2 M Tris-HCl (pH 8.9) and 2 µl of 2 mg/ml
RNase A. Following incubation at 37°C for 30 min
and centrifugation, the supernatant was
transferred to a new tube containing 0.6 ml of cold
isopropanol. The tube was inverted several
times and incubated at room temperature for 10
min. DNA was collected by centrifugation and
then washed with 75% ethanol. The pellet was
dried by incubation at 37°C for 5 min and
resuspended in 50 µl of 10 mM Tris-HCl (pH
8.0), 1 mM EDTA.

The λ DNAs prepared as follows were used
to modify the structure, to analyze cDNA clones in
λpCEV vectors, and to rescue the plasmid
part. Host cells grown in 10 ml of NZY medium
containing 2 mM MgCl2 were suspended in the
same volume of SM buffer (50 mM Tris-HCl, pH
7.5/8 mM MgSO4/100 mM NaCl/0.01% gelatin). A
single plaque picked with a Pasteur pipet was
incubated with 0.1 ml of the host cell suspension
at 37°C for 30 min. Ten ml of pre-warmed NZY
broth containing 2 mM MgCl2 was added and the
culture was shaken at 37°C for 6 h. This procedure
allows single-step production of high-titer lysates.
Phage particles were precipitated and DNAs
prepared as described by Arber et al., 1983 [In
Hendrix, R. W., Roberts, J. W., Stahl, F. W. and
Weisberg, R. A. (Eds.), Lambda II. Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY, 1983,
pp. 433-466], with several modifications. A few
drops of chloroform were added to the lysate and
mixed, and debris was removed by centrifugation.
After the chloroform remaining in the lysate was
removed by incubation at 37°C, 50 µl each of DNase
I (1 mg/ml) and RNase A (1 mg/ml) were added, and
the lysate was incubated at 37°C for 1 h. Phage
particles were precipitated by the addition of 5 ml
of 30% PEG/3 M NaCl/10 mM MgCl2 followed by
incubation on ice for 1 h. Phage particles were
collected by centrifugation at 3,000 rpm for 30 min,
and the pellet was resuspended in 0.5 ml of SM. The
suspension was transferred to a fresh
microcentrifuge tube containing 20 µl of
proteinase K (20 mg/ml). After incubation at
37°C for 15 min, 50 µl of 100 mM Tris-HCl (pH 8.0)/
100 mM EDTA/1% sodium dodecyl sulfate was added
and the tube was incubated at 65°C for 30 min.
The released DNA was extracted with
phenol/chloroform and then with chloroform. DNA
was precipitated by 0.6 volume of isopropanol
with 0.3 volume of 7.5 M ammonium acetate, washed
with 75% ethanol and dried. The pellet was
dissolved in 10 mM Tris-HCl (pH 8.0), 1 mM EDTA,
0.1 mg/ml of RNase A and incubated at 37°C for 30
min to digest ribosomal RNAs.

Plasmid DNAs prepared by the selective 
precipitation method were directly used to modify 
the structures, by restriction enzyme digestions, 
repair reactions, and/or ligation with synthetic 
linkers. DNA fragments were separated on agarose
gels and then purified using GENECLEAN (BIO 101
Inc.).

Insertion of oligonucleotides was 
performed as follows. The two strands of 
non-phosphorylated oligonucleotides were annealed 
and ligated with plasmid DNA that had been
digested with suitable restriction enzymes. The
oligonucleotide strand that was not
ligated (because of its 5'-OH end) was removed
by heating, and the products were separated by
agarose gel electrophoresis. The purified fragments
were phosphorylated and then ligated.

Plasmid pCEV9 was constructed as
follows. A retroviral vector pDOL" (Korman et
al., 1987, Proc. Natl. Acad. Sci. USA 84,
2150-2154) was cleaved by XbaI and recircularized
to remove the polyoma segment. The ClaI site was
converted to a NotI site by linker insertion, and
the EcoRI site was removed by repair ligation. A
synthetic MCS linker consisting of
oligonucleotides #1 and #2 (sequences listed above)
was inserted between the SalI and BamHI sites,
and the BclI/BamHI fragment of SV40 DNA (Bethesda
Research Laboratories) containing its
polyadenylation signal was inserted at the XhoI
site. These manipulations produced pCEV9.

λgt11 DNA (Young and Davis, 1983, Proc.
Natl. Acad. Sci. USA 80, 1194-1198) was ligated
and then cleaved by EcoRI and XbaI. The ends
were repaired and NotI-linkered, and then pCEV9
DNA linearized by NotI was cloned into the λ DNA
to produce λpCEV9.

pCEV15 was constructed by modifying
pCEV8, which is a parental plasmid of pCEV9 and
lacks the MCS linker. A tac promoter fragment
(Pharmacia) and an XhoI linker were inserted in the
SalI site. The SfiI(B), HindIII, and BglII sites
were removed successively by restriction enzyme
digestion, polymerase treatment, and subsequent
ligation. Removal of the SfiI site (in
the SV40 replication origin) did not impair the
SV40 early promoter activity. The MCS linker
was inserted between the BamHI and SalI sites,
and SfiI(B), HindIII, and BglII sites were
introduced again by insertion of the MCS-2 linker
(oligonucleotides #3 and #4, listed above) between
the EcoRI and ApaI sites. The resulting plasmid
pCEV15 was cloned in a new λ vector constructed
as follows. The segment spanning from the XhoI
site to the right cos end of λpCEV9 was replaced
by the corresponding fragment of λcharon28 (Rimm
et al., 1980, Gene 12, 301-309), to introduce the
cI deletion (KH54) and remove three HindIII
sites. The resulting phage λpCEV9c DNA was
ligated, cleaved by HindIII, and then repaired.
A NotI linker was ligated to the repaired HindIII
ends and the DNA was cleaved by NotI. The λ arms
were purified and ligated with NotI-digested
pCEV15 DNA, to produce λpCEV15.

Example 2. Efficiency of cloning and orientation
of model inserts.

Preparation of λ arms and the SfiI
adaptor for all cloning experiments was performed
as follows. λ DNA was ligated to seal the cohesive
ends and then cleaved sequentially by SfiI and
EcoRI. After phenol/chloroform extraction,
λpCEV9 arms were purified by centrifugation
through a 5-20% potassium acetate gradient
(Maniatis et al., 1982). λpCEV15 arms were
purified by passage through a Sephadex G-50 spin
column (Boehringer Mannheim Biochem.).

The SfiI adaptor was prepared as
follows. About 1 nmol each of oligonucleotides #5
and #6 were separately phosphorylated by T4
polynucleotide kinase, mixed, heated at 80°C for
5 min and then slowly cooled to 4°C.

RNA for making cDNAs was extracted and
poly(A)+ RNA selected as described by Okayama
et al. (1987). cDNA was synthesized essentially
according to D'Alessio et al. (1987) with some
modifications. About 2.5 µg of poly(A)+ RNA in
10 µl of H2O was mixed with 0.5 µl of 100 mM
methylmercuric hydroxide (MeHg), and incubated at
room temperature for 5 min, followed by addition
of 0.5 µl of 1.4 M β-mercaptoethanol. After 5
min, 1.2 µl of RNasin (40 units/µl; Promega
Biotech.), 17.8 µl of H2O, 10 µl of 5x FS buffer
(250 mM Tris-HCl, pH 8.3/375 mM KCl/15 mM MgCl2/100
mM dithiothreitol), 2.5 µl of dNTP mixture (10
mM each of dGTP, dATP, dTTP and dCTP), 5 µl of
linker-primer (oligonucleotide #8; 1 mg/ml) and
2.5 µl of M-MLV reverse transcriptase (200
units/µl) were sequentially added, mixed, and
incubated at 37°C for 1 h. The tube was chilled
on ice and 290 µl of H2O, 7.5 µl of dNTP mix (10
mM each of dGTP, dATP, dCTP, and TTP), 40 µl of
10x SS buffer (188 mM Tris-HCl, pH 8.3/906 mM
KCl/46 mM MgCl2/38 mM DTT), 10 µl of DNA
polymerase I (1.25 units/µl) and 1.8 µl of RNase
H (0.25 units/µl) were added, mixed and
incubated at 16°C for 2 h. The reaction mixture
was heated at 70°C for 10 min, and 5 µl of T4
DNA polymerase (1 unit/µl) was added and
incubated at 37°C for 10 min. The reaction was
terminated by the addition of 40 µl of 0.25 M
EDTA, and the mixture was extracted with
phenol/chloroform twice followed by chloroform
twice. cDNA was ethanol-precipitated from 2.5 M
ammonium acetate, washed, and then dried. The
pellet was dissolved in 10 µl of H2O and then
4 µl of SfiI adaptor (0.8 mg/ml), 4 µl of 5x
ligation buffer (500 mM Tris-HCl, pH 7.6/100 mM
MgCl2/10 mM ATP/10 mM dithiothreitol/50% (w/v)
polyethylene glycol-8000) and 2 µl of T4 DNA
ligase (1 unit/µl) were mixed and incubated at
14°C overnight. A 10 µl aliquot was then mixed
with 1 µl of 10x SfiI buffer (100 mM Tris-HCl,
pH 7.9/500 mM NaCl/100 mM MgCl2/60 mM
β-mercaptoethanol/1 mg/ml bovine serum
albumin), 2 µl of SfiI (10 units/µl) and 7 µl of
H2O. Digestion was performed at 50°C for 1 h.
cDNA fragments were purified by low-melting point
agarose gel electrophoresis or by passage through a
spun column (Maniatis et al., Molecular Cloning,
Cold Spring Harbor 1982) packed with Sepharose
CL-4B (Pharmacia). An aliquot of the cDNA
preparation was then mixed with λpCEV vector
arms and ethanol-precipitated. DNA was dissolved
in 8 µl of H2O, and then 1 µl of 10x
kinase-ligase buffer (660 mM Tris-HCl, pH 7.5/100
mM MgCl2/50 mM dithiothreitol/500 µM ATP) and 1
µl of T4 DNA ligase (1 unit/µl) were added,
mixed, and incubated at 14°C overnight. In
vitro packaging was performed using GigaPack Gold
(StrataGene) as directed.

As shown in Fig. 4B, neither the
vector nor the H-ras insert SfiI fragment
self-ligated, while self-ligation occurred when
the EcoRI/ApaI fragments were used instead. In
the ADC system, ligation occurred only when both
the vector and insert fragments were present in
the reaction mixture (Fig. 4B). To characterize
the directional cloning capacity of the system,
the H-ras insert and vector SfiI fragments were
ligated and used to transform E. coli strain
DH5α, and 20 kanamycin-resistant colonies were
analyzed. As shown in Fig. 4C, all plasmids
contained single inserts in the expected
orientation, indicating that the ADC method
provides both directional cloning and positive
selection for the presence of inserts. To
further examine the performance of the ADC method
using the λ system, model inserts prepared to
have SfiI(A-) and SfiI(B+) ends were ligated with
λpCEV9 arms (see Fig. 2C). As shown in Table I,
λpCEV9 arms alone did not produce active phages
efficiently even when the ligation reaction was
carried out, while the presence of model inserts in
the ligation mixture increased the titer of active
phages by three orders of magnitude. These results
indicated that successful ligation and phage
propagation depended on the presence of the model
insert in the reaction mixture. All of these
findings indicated that the cloning procedure
results in low background and efficient directional
cloning.

Table I. Packaging efficiency of λpCEV9 DNA

DNA                                     Titer (a) (pfu/µg λ arms)

λpCEV9 arms                             1 x 10^3
λpCEV9 arms, ligated                    8 x 10^[exponent illegible]
λpCEV9 arms + Insert A (b), ligated     8 x 10^[exponent illegible]
λpCEV9 arms + Insert B (c), ligated     8 x 10^[exponent illegible]

Footnotes for Table I:

a The reaction mixture contained 66 mM Tris-HCl
(pH 7.5), 10 mM MgCl2, 5 mM dithiothreitol, 50 µM ATP, and 0.1
µg/µl of DNA. Incubation was performed at 14°C overnight, and
the phage were produced by in vitro packaging and titered on
LE392.

b A 2 kb DNA fragment having the SfiI(A-) and
SfiI(B+) ends (see Fig. 2).

c A DNA fragment similar to insert A, except that
the SfiI(A-) end was created by ligation of the SfiI adaptor.
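The absence of self-ligation among SfiI fragments can be
illustrated with a short sketch. SfiI cleaves within
GGCCNNNN^NGGCC and leaves a 3-nucleotide 3' overhang; because
the overhang has an odd length, its middle base would have to
pair with itself, so the overhang can never be its own reverse
complement. The Python code below is illustrative only: the
function names and the example site are hypothetical and are
not taken from the specification.

```python
import re
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")
SFII_SITE = re.compile(r"GGCC[ACGT]{5}GGCC")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def sfii_overhang(site: str) -> str:
    """Return the 3-nt 3' overhang produced when SfiI cuts a 13-bp site.

    The top strand is cut after base 8 and the bottom strand after the
    position pairing with top base 6, so top-strand bases 6-8 (1-based)
    are left single-stranded.
    """
    site = site.upper()
    if SFII_SITE.fullmatch(site) is None:
        raise ValueError("not a 13-bp SfiI recognition site")
    return site[5:8]


def can_self_anneal(overhang: str) -> bool:
    """True if two copies of the same overhang could base-pair with each other."""
    return overhang == reverse_complement(overhang)


def self_annealing_overhangs() -> list:
    """List every 3-nt overhang that is self-complementary (expected: none)."""
    all_overhangs = ("".join(p) for p in product("ACGT", repeat=3))
    return [o for o in all_overhangs if can_self_anneal(o)]


if __name__ == "__main__":
    example_site = "GGCCATTATGGCC"  # hypothetical site, for illustration only
    print(sfii_overhang(example_site), self_annealing_overhangs())
```

By contrast, the 4-nucleotide overhang left by EcoRI (AATT) is
self-complementary, which is consistent with the self-ligation
reported for the EcoRI/ApaI fragments.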

To measure the efficiency of cDNA cloning
using a natural template, 2.5 µg of a poly(A)+ RNA
preparation was denatured by heating and used to
synthesize cDNA from a linker-primer
(oligonucleotide #7). The cDNA was blunt-ended, and
the SfiI adaptor was ligated to both ends. The
molecules were cleaved partially by SfiI, and then
cloned in λpCEV9. A total of 2.5 x 10^8 plaque
forming units (pfu) was obtained, indicating that
the method was extremely efficient.

Since SfiI is an infrequent cutter, almost
all cDNA species in the libraries constructed by the
ADC method should remain intact. Nonetheless,
cDNAs containing SfiI sites might be excluded from
our cDNA libraries. To solve this problem, partial
SfiI digestion is usually performed. An alternative
strategy involves cDNA synthesis from a
linker-primer containing the recognition site of
another infrequent cutter, MluI, in addition to the
SfiI site. The cDNA preparation could then be
divided into two parts, one cleaved by SfiI and the
other by MluI. A short oligonucleotide ligated to
the cDNAs could be utilized to convert the MluI end
to an SfiI(B+) end.
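How infrequently SfiI and MluI are expected to cut can be
estimated with a simple sketch. The Python code below assumes a
random sequence with equal base frequencies, which is only a
rough approximation (mammalian DNA is depleted in CpG, so real
MluI sites are rarer still). The recognition sequences are the
known ones for SfiI (GGCCNNNNNGGCC) and MluI (ACGCGT); the
function names and the test sequences are hypothetical.

```python
import re

# Recognition sequences; N stands for any base.
SITES = {"SfiI": "GGCCNNNNNGGCC", "MluI": "ACGCGT"}
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def expected_spacing(site: str) -> int:
    """Mean distance (bp) between sites in random DNA with equal base usage.

    Only the specified (non-N) positions constrain the match, so the
    expected spacing is 4 raised to the number of specified bases.
    """
    specified = sum(1 for base in site if base != "N")
    return 4 ** specified


def find_sites(seq: str, site: str) -> list:
    """Return 0-based start positions of every match, overlaps included.

    Both recognition patterns are symmetric, so scanning one strand
    is sufficient.
    """
    pattern = "".join(IUPAC[base] for base in site)
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]


if __name__ == "__main__":
    for name, site in SITES.items():
        print(name, expected_spacing(site))
```

Under this model an SfiI site is expected about once per 65,536
bp and an MluI site about once per 4,096 bp, both far longer
than a typical cDNA insert.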

Example 3. Isolation of a Mouse Hepatoma Oncogene
cDNA Using a Novel Phenotypic Expression Cloning
System.

The following protocols/methodologies are
referred to in sections 3.1 - 3.6 below.

cDNA Library Construction 

cDNA libraries were constructed as
described above except for the use of a newly designed
adaptor and vector. The new SfiI adaptor does not
contain the ATG codon in the sense strand and
consisted of two oligonucleotides,
5'-CCAATCGCGACC-3' and 5'-GGTCGCGATTGGTAA-3'.
Amplification of the library and preparation of the
DNA were performed by the standard procedures
(Maniatis et al. 1982).
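The two adaptor oligonucleotides given above can be checked
computationally. The Python sketch below is illustrative only
(the function and variable names are hypothetical): it tests
whether the 12-mer pairs with the 5' portion of the 15-mer,
reports the unpaired 3' tail, and confirms that neither strand
contains an ATG triplet.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

SHORT_OLIGO = "CCAATCGCGACC"     # 12-mer from the text
LONG_OLIGO = "GGTCGCGATTGGTAA"   # 15-mer from the text


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_tail(short: str, long_: str):
    """Return the unpaired 3' tail of `long_` when `short` anneals to its 5' part.

    Returns None if the short oligo is not fully complementary to the
    5' portion of the long oligo.
    """
    paired_region = reverse_complement(short)
    if long_.startswith(paired_region):
        return long_[len(paired_region):]
    return None


def contains_atg(seq: str) -> bool:
    """True if the strand, read 5' to 3', contains an ATG triplet."""
    return "ATG" in seq.upper()


if __name__ == "__main__":
    print(three_prime_tail(SHORT_OLIGO, LONG_OLIGO))
    print(contains_atg(SHORT_OLIGO), contains_atg(LONG_OLIGO))
```

The annealed duplex is blunt at one end and carries a
3-nucleotide 3' extension at the other, the same overhang
length that SfiI cleavage produces.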

Cell Culture and DNA transfection 

All cells used were derivatives of NIH
3T3 (Jainchill, J.L., et al. (1969) J. Virol., p.
549). Calcium phosphate transfection (Wigler, M.
et al., (1977) Cell 11, 223) was used to introduce
DNA into cells. Cells were maintained in Dulbecco's
modified Eagle's medium (DMEM) containing 5% calf
serum.

Plasmid Rescue

The genomic DNA (1.2 µg) isolated from
CT18-2B or CT18-2C was cleaved by XhoI, extracted
with phenol-chloroform and then chloroform. The DNA
was precipitated and resuspended in 355 µl of H2O.
Fifty µl of 10x kinase-ligase buffer (660 mM
Tris-HCl, pH 7.5 / 100 mM MgCl2 / 50 mM dithiothreitol,
500 µM ATP) and 5 units of T4 DNA ligase (BRL) were
added and incubated overnight at 15°C. The ligated
DNA was extracted and precipitated as above and then
resuspended in 10 µl of TE. Four aliquots (100 µl
each) of PLK-F' competent cells (Stratagene) were
transformed by 0.5, 1, 2, and 4 µl of the ligated
DNA as directed by the manufacturer. After the heat
shock, the cells were diluted 10-fold with S.O.C.
medium (BRL) containing 1 mM IPTG to induce
expression of the neo gene driven by the tac
promoter. The culture was incubated for 2 h with
shaking and plated on NZY hard agar containing
ampicillin (100 µg/ml), kanamycin (25 µg/ml), and
IPTG (100 µM).

Recombinant DNA Techniques 

Preparation of λ and plasmid DNA was
performed as described above. Genomic DNA was
extracted by the standard procedure (Maniatis et
al., 1982). Total RNA was isolated and
poly(A)-selected as described (Okayama et al.
(1987) PNAS 84, 8573).

Southern and Northern Analysis 

DNA fragments were isolated by Geneclean 
(BIO101 Inc.) and labeled by random priming using 
Oligo Labeling kit (Pharmacia) . Hybridizations were 
performed as described (Kraus et al., 1987). 

3.1 Development of an Efficient Stable Expression 
cDNA Cloning System 

The λpCEV27 system was developed to clone
cDNAs by means of stable phenotypic changes induced
by a specific cDNA. Use of a λ-plasmid composite
vector made it possible to generate high complexity
cDNA libraries and to efficiently excise the plasmid
from the stably integrated phagemid DNA. This
phagemid vector (Figure 6) contained several
features including two SfiI sites for construction
of cDNA libraries using the automatic directional
cloning (ADC) method, an M-MLV LTR promoter suitable
for cDNA expression in mammalian cells, the SV40
promoter-driven neo gene as a selectable marker, and
multiple excision sites (MESs) for plasmid rescue
from genomic DNA. The λpCEV27 system incorporated,
in addition to the M-MLV LTR, the rat preproinsulin
polyadenylation (polyA) signal downstream from the
cDNA cloning site (Fig. 6). In this vector, the
bacterial neo gene was placed under the independent
control of the SV40 early promoter and the SV40 late
polyA signal for use in marker selection in
mammalian cells. In contrast to λpCEV15, the bona
fide promoter of the neo gene was removed so as to
fuse the SV40 promoter directly to the neo
structural gene. Thus, in λpCEV15, expression of
the neo gene in E. coli was achieved by
transcription from the trp-lac fused promoter tac,
inserted upstream from the SV40 early promoter (Fig.
6). By use of the tac promoter, it was possible to
utilize IPTG-inducible selection for kanamycin
resistance. Finally, the f1 replication origin and
the SP6 and T7 phage promoters were included to
facilitate analysis of cDNA inserts by production
of single-stranded DNA and RNA transcripts,
respectively (Fig. 6).

The strategy for expression cloning of
oncogene cDNAs is summarized in Figure 7. When
library cDNA is used to transfect mammalian cells,
cDNA clones are integrated by recombination
between λ DNA and host genomic DNA. For plasmid
rescue, genomic DNA extracted from a transformant is
subjected to digestion with an enzyme which can
cleave the λ-plasmid junctions. The resulting DNA
can then be circularized and used for bacterial
transformation. For this purpose, the sites for two
additional infrequent cutters, XhoI and MluI, were
included along with the NotI site. Because of the
second selectable marker in bacterial cells, the
ampicillin resistance gene (amp), it was possible to
select transformed E. coli cells resistant to both
ampicillin and kanamycin, avoiding selection of
truncated plasmid clones.
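The benefit of providing several excision sites can be sketched
as follows: if a cDNA insert happens to contain the site for
one rescue enzyme, another enzyme whose site is absent from the
insert can be used instead. The Python code below is a minimal
illustration, assuming the insert sequence is known; the
recognition sequences for NotI, XhoI and MluI are the standard
ones, while the function name and example insert are
hypothetical.

```python
# Standard recognition sequences of the three rescue enzymes.
# All three are palindromic, so scanning one strand is sufficient.
RESCUE_ENZYMES = {
    "NotI": "GCGGCCGC",
    "XhoI": "CTCGAG",
    "MluI": "ACGCGT",
}


def usable_rescue_enzymes(insert: str) -> list:
    """Return the enzymes that would not cut inside the given insert."""
    insert = insert.upper()
    return [name for name, site in RESCUE_ENZYMES.items() if site not in insert]


if __name__ == "__main__":
    # Hypothetical insert carrying an internal XhoI site.
    example_insert = "AAACTCGAGTTTGGGCCC"
    print(usable_rescue_enzymes(example_insert))
```

With three alternative enzymes, an insert is lost to internal
cleavage only if it contains sites for all three.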

3.2 Characterization of Oncogenes Activated in
Mouse Liver Tumors

We have previously analyzed hepatocellular 
tumors of the mouse strain B6C3F1 for the presence 
of activated oncogenes (Reynolds et al. (1987) 
Science 237, 1309-1316). Although the majority were 
activated ras or c-raf oncogenes, four could not be 
identified. The sources of these oncogenes were 
tumors designated OT4, OT18, OT23, and OT28. One 
(OT23) was spontaneously generated, while the others 
were associated with chronic furfural exposure 
(Reynolds, S.H., et al., (1987)). Genomic DNAs of 
NIH/3T3 transformants containing each of the 
unidentified oncogenes were examined under low 
stringency hybridization conditions using a number 
of known and potential oncogene probes including 
abl, myb, ets, fos, fgr, fms, rel, src, sis, yes, 
p53, ros, PDGF-A, met, dbl, lskT, myc, N-myc, rho, 
mos, erbA, pin, lea, H-ras, N-ras, K-ras, c-raf, and 
erbB-2. None showed DNA fragments with either 
increased intensity or abnormal sizes relative to 
those detected in NIH/3T3 control DNA (data not 
shown) . Thus, none of these transforming genes 
appeared to be closely related to any of the genes 
used as probes. 

3.3 Expression cDNA Cloning of an Oncogene of a 
Furfural-induced Hepatoma 

Using the λpCEV27 expression cloning
system, we attempted to clone transforming cDNA
from one furfural-induced tumor, OT18. A cDNA
library (3 x 10^[exponent illegible] independent clones)
was constructed from poly(A)+ RNA extracted from a
secondary transformant of the tumor. Transfection of 80
plates of NIH/3T3 cells with 5 µg/plate of the
expression cDNA library led to the detection of
seven foci, which demonstrated G-418 resistance.
These results indicated that each had taken up and
stably integrated vector DNA, making it likely that
the transformed foci were induced by exogenous cDNA
rather than arising as a result of spontaneous
transformation.

Two of these transformants, designated
CT18-2B and CT18-2C, were selected for plasmid
rescue. By restriction mapping of several distinct
plasmids obtained from each transformant, it was
possible to establish that one plasmid rescued from
each had the identical insert (data not shown).
These results suggested that this cDNA might encode
the oncogene product. Transfection analysis of each
rescued plasmid DNA demonstrated that these same two
cDNA clones possessed high-titered transforming
activity of around 10^[exponent illegible] ffu/nmol DNA,
while none of the other plasmids rescued from the same
transformants showed detectable activity.

To determine whether other transformants
induced by the cDNA library contained the OT18
oncogene, genomic DNA extracted from each primary
transformant was digested with SalI to release cDNA
inserts from the vector and subjected to Southern
blotting analysis using the OT18 cDNA insert as a
probe. Since SalI is an infrequent cutter for
mammalian DNA, genomic DNA was cleaved to very large
fragments which remained near the origin of the gel.
Thus, the relatively small cDNA fragments released
from cellular DNAs by SalI cleavage could be
separated from the bulk of genomic fragments. As
shown in Figure 8, each of the cDNA library
transformants contained the OT18 oncogene cDNA
insert. The sizes ranged from around 2.3 kb to 7
kb, suggesting that several of the inserts
represented independent cDNA clones of the oncogene.

3.4 Structural Analysis of T18 oncogene cDNA 

A detailed restriction map of the 2.1 kb
insert of one of the transforming plasmids was
constructed, and the clone was subjected to sequence
analysis. A database search indicated that the
5' portion of the cDNA contained an unknown
sequence, while the 3' region was closely related to
human B-raf (Ikawa et al., (1988) Mol. Cell. Biol.
8, 2651-2654), and chicken R-mil (Marx et al., (1988)
EMBO J. 7, 3369-3373) genes (Figure 9a). Comparison of
the predicted amino acid sequences with that of
B-raf indicated identity (Figure 9) with the exception
of a single amino acid difference at position 324,
in which Gly was substituted for Ala in human B-raf.

There was also complete identity with 
avian R-mil except for a small stretch of nine amino 
acids at the R-mil C-terminus, where recombination 
with an avian retroviral env gene caused this 
substitution. 

To determine the breakpoint, we also
compared the T18 nucleotide sequence with that of
proto B-raf and v-R-mil. There was no homology with
either sequence upstream from position 1040 in the
T18 oncogene. Thus, position 1040 represents the
junction between an unknown sequence and the B-raf
gene. R-mil is a viral oncogene and encodes a
gag-R-mil-env fusion protein. The junction of gag and
R-mil has been mapped 144 nucleotides upstream from
the T18 breakpoint (Marx et al., 1988), while the
junction of a different sequence and the human
B-raf oncogene was 174 nucleotides upstream from the
junction in the T18 oncogene. In each of the B-raf
oncogenes, including T18, the breakpoints did not
disrupt the predicted kinase domain of the protein.
3.5 Evidence for In Vivo Oncogene Activation

The human B-raf gene product is around 84%
related to the amino acid sequence of the c-raf
oncogene. Another member of the raf family, A-raf,
is also structurally similar. Most raf oncogenes
have been activated by mechanisms involving
structural rearrangements due to recombination and
loss of amino terminal sequences of the raf coding
sequence (Rapp, U.R., et al., (1988) In: Reddy, E.P.
(ed.) The Oncogene Handbook, Elsevier Science
Publishers B.V.). Moreover, most reported instances
have involved in vitro activation of these genes
during DNA transfection rather than by mechanisms
leading to oncogene activation within the tumor
itself (Ishikawa, F., et al., (1987) Mol. Cell.
Biol. 7, 1226); (Stanton, V.P. and Cooper, G.M.
(1987) Mol. Cell. Biol. 7, 1171); (Ikawa, S., et al.,
(1988) Mol. Cell. Biol. 8, 2651-2654). The
rearrangement activating the B-raf oncogene might
have occurred during the course of cDNA library
construction or as an artifact of DNA transfection
with the original tumor DNA. Alternatively, the
oncogene might have been activated within the
original tumor itself.

While original tumor DNA was not
available, it was possible to analyze two primary
transformants which had been independently induced
by this tumor DNA. As shown in Figure 10,
rearrangement as well as amplification of both
non-B-raf and related B-raf portions of the gene
were found not only in the secondary transfectant,
which was the source of the cDNA library, but in
both primary transformants, PT18-1 and PT18-2.
Since such in vitro rearrangements are very rare,
these findings strongly argue that the oncogene was
activated in vivo in the hepatoma as part of the
neoplastic process.

3.6 Detection of the mRNAs for the T18 Oncogene

In an effort to characterize the B-raf
oncogene transcript and search for evidence of
additional B-raf oncogenes among the 3 other
hepatoma oncogenes, we subjected polyA-selected RNAs
from primary or secondary NIH/3T3 transfectants
containing each oncogene to analysis with DNA probes
from the 5' (B-raf unrelated) and 3' (B-raf) portions of
the T18 oncogene. As shown in Fig. 11, control
NIH/3T3 cells contained a 4.2 kb RNA that hybridized
with the 5' non-B-raf related portion of the
oncogene, but there was no detectable B-raf
transcript. In contrast, the second cycle T18
transfectant, which was the source of the expression
cDNA library, showed a major 4.2-kb as well as minor
10- and 3-kb transcripts which appeared to hybridize
with both probes.

Fig. 11 further shows that a primary T23
oncogene transfectant contained multiple B-raf
hybridizing transcripts, indicating that it also
contained another B-raf oncogene. Of note, the
several transcripts detected differed in their sizes
from those of the T18 oncogene. Moreover, none of
these transcripts was detected by the B-raf probe of
the T18 oncogene (Fig. 11). Thus, if the T23
oncogene arose by a mechanism involving B-raf gene
rearrangement, this rearrangement was different from
that associated with activation of the T18 oncogene.
The transfectant induced by the T28 oncogene (Fig.
11) did not show abnormal B-raf hybridizing RNAs,
arguing that this oncogene must be distinct from
B-raf.
Example 4. cDNA Expression Cloning of the
Keratinocyte Growth Factor Receptor by
Complementation for Autocrine Transformation

4.1 Identification of epithelial cell cDNAs capable
of transforming NIH/3T3 cells.

We prepared a cDNA library from BALB/MK
epidermal keratinocytes (Weissman, B.E. & Aaronson,
S.A. Cell 32, 599-606 (1983)) using the automatic
directional cloning method (Miki, T. et al., Gene
83:137-146, (1989)) in an improved expression vector,
λpCEV27 (Miki, unpublished data). A library of
4.5 x 10^[exponent illegible] independent clones was
amplified, phage particles purified, and their DNA
extracted. DNA transfection of NIH/3T3 mouse embryo
fibroblasts (Jainchill, J.L. et al., J. Virol 4, 549-553
(1969)), which synthesize KGF, was performed by the
calcium phosphate technique (Wigler, M. et al. Cell
11, 223-232 (1977)). Individual plates were
examined at 10-18 days for the appearance of
transformed foci. We detected 15 foci among a total
of 100 individual cultures transfected with 5 µg
library cDNA/plate. Each focus was tested and shown
to be resistant to G418, indicating that it
contained integrated vector sequences. Three
representative transformants were chosen for more
detailed characterization based upon differences in
their morphologies (Fig. 13).

When we performed plasmid rescue, each
transformant gave rise to at least 3 distinct cDNA
clones as determined by physical mapping. To
examine their biological activities, each clone was
subjected to transfection analysis on NIH/3T3 cells.
A single clone rescued from each transformant was
found to possess high-titered transforming activity
ranging from 10^3 to 10^4 focus forming units/nmole DNA.
Moreover, the morphology of foci induced by each
cDNA was similar to that of the parental
transformant. Because of their distinct physical
maps and distinguishable biological properties, we
tentatively designated the genes for these
transforming cDNAs as epithelial cell transforming
(ect) 1, 2 and 3. Transfectants induced by the
individual transforming plasmids were utilized in
subsequent analyses.

4.2 Specific KGF binding by ect1-transformed cells.

Suramin is an agent known to interfere
with ligand-receptor interactions including the
binding of PDGF (Fleming, T.P. et al, Proc. Natn.
Acad. Sci. U.S.A. 86, 8063-8067 (1989); Betsholtz,
C. et al. Proc. Natn. Acad. Sci. U.S.A. 83,
6440-6444 (1986)) and KGF with their respective
receptors. When an ect1 transfectant was exposed to
suramin, its proliferation was markedly inhibited,
associated with reversion of the transformed
phenotype (data not shown). To further investigate
the possibility that ect1 might encode the KGF
receptor, we performed binding studies with
recombinant [125I]-KGF as the tracer molecule. As
shown in Fig. 14, BALB/MK cells demonstrated
specific high affinity binding of [125I]-KGF while
there was no such binding detectable to NIH/3T3
cells. Of note, expression of the ect1 gene by
NIH/3T3 cells resulted in the acquisition of
3.5-fold more [125I]-KGF binding sites than BALB/MK
cells (Fig. 13). Under these same conditions,
neither NIH/ect2 nor ect3 bound significant levels
of the labelled growth factor. These results
strongly suggested that ect1 encoded the KGF
receptor, whose introduction into NIH/3T3 cells had
completed an autocrine transforming loop.
4.3 Molecular characterization of ect1.

To characterize ect1, the transforming
4.2 kb cDNA released by SalI digestion was used as a
molecular probe to hybridize SalI-restricted genomic
DNAs of NIH/3T3 as well as NIH/3T3 transfectants
containing ect1, ect2 or ect3. While the expected
4.2 kb DNA fragment was detected in the ect1
transformant (Fig. 15a), neither NIH/3T3 nor the
other transfectants showed evidence of a SalI
fragment hybridized by the cDNA insert. These
results further argued that ect2 and ect3
represented independent transforming genes. When
EcoRI was used to cleave normal mouse DNA, we
observed several distinct ect1 hybridizing DNA
fragments, which reflected endogenous ect1 sequences
or closely related genes (Fig. 15b). Ect1-related
sequences were also demonstrated in the DNAs of
other species analyzed, including human, indicating
its high degree of conservation in vertebrate
evolution.

When we analyzed expression of transcripts
related to ect1 in BALB/MK and NIH/3T3 cells, we
observed a single transcript of around 4.2 kb in
BALB/MK cells (Fig. 15c). Thus, our cDNA clone
represented essentially the complete ect1
transcript. In NIH/3T3 cells, a transcript of
comparable size was only faintly detectable under
relatively stringent hybridization conditions. We
estimated that its expression was several fold lower
than the level of the 4.2 kb transcript in BALB/MK
cells. Thus, if this transcript were to represent
ect1 rather than a related gene, its expression was
markedly lower in fibroblasts as compared to
epithelial cells.
4.4 Ect1 encodes a transmembrane tyrosine kinase of
the FGF receptor family.

We next determined the nucleotide sequence
of the ect1 4.2 kb cDNA insert. Analysis of the
predicted amino acid sequence revealed a long open
reading frame of 2235 nucleotides (nucleotide
position 562-2796). Two methionine codons were
found at nucleotide positions 619 and 676,
respectively (Fig. 16a). The second methionine
codon perfectly matched Kozak's consensus for a
translational initiator sequence (A/GCCATGG) (Kozak,
M. Nucleic Acids Res. 15, 8125-8148 (1987)).
Moreover, it was followed by a characteristic signal
sequence of 21 residues of which 10 were identical
to those of the putative signal peptide of the mouse
bFGF receptor (Reid, H.H., et al., Proc. Natn. Acad.
Sci U.S.A. 87, 1596-1600 (1990); Pasquale, E.B. &
Singer, S.J. Proc. Natn. Acad. Sci. U.S.A. 86,
8722-8726 (1989); Safran, A. et al. Oncogene 5, 635-643
(1990)). Thus, it seems likely that the second ATG
is the authentic initiation codon for the KGF
receptor (KGFR). If so, the receptor polypeptide
would comprise 707 amino acids with a predicted size
of 82.5 kd.
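The coordinates quoted above are internally consistent, as the
following arithmetic sketch shows. It is an illustration only:
the variable and function names are hypothetical, and it
assumes that positions 562-2796 are 1-based, inclusive, and do
not include the stop codon.

```python
ORF_START, ORF_END = 562, 2796   # open reading frame, from the text
MET_POSITIONS = (619, 676)       # first base of each methionine codon
PREDICTED_MASS_DA = 82_500       # 82.5 kd, from the text


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive interval."""
    return end - start + 1


def in_frame(position: int, frame_start: int = ORF_START) -> bool:
    """True if a codon starting at `position` is in the frame set by `frame_start`."""
    return (position - frame_start) % 3 == 0


def codons_from(position: int, end: int = ORF_END) -> int:
    """Number of complete codons from `position` to the end of the frame."""
    return span_length(position, end) // 3


if __name__ == "__main__":
    print(span_length(ORF_START, ORF_END))         # total ORF length
    print([in_frame(p) for p in MET_POSITIONS])    # both ATGs in frame?
    print(codons_from(MET_POSITIONS[1]))           # residues from second ATG
    print(PREDICTED_MASS_DA / codons_from(MET_POSITIONS[1]))
```

Initiation at position 676 gives 2121 nucleotides, or 707
codons, in agreement with the stated polypeptide length, and an
average of roughly 117 daltons per residue.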

The amino acid sequence of the KGFR
predicted a transmembrane tyrosine kinase most
closely related to the mouse bFGF receptor (bFGFR).
The percent similarity between both proteins is
shown in Fig. 16b. The putative KGFR extracellular
portion contained two immunoglobulin (IgG)-like
domains, exhibiting 77% and 60% similarity with the
IgG-like domains 2 and 3, respectively, of the mouse
bFGFR. Recent studies have revealed a variant form
of the bFGFR, whose extracellular domain also
contains only these two corresponding IgG-like
domains (Reid, H.H., et al., Proc. Natn. Acad. Sci
U.S.A. 87, 1596-1600 (1990)). The sequence
N-terminal to the first IgG-like domain of the KGFR
was 63 residues long in comparison to 88 residues
found in the shorter form of the mouse bFGFR. Both
the chicken and mouse bFGFRs contain a series of
eight consecutive acidic residues between the first
and second IgG-like domains (Reid, H.H., et al.,
Proc. Natn. Acad. Sci U.S.A. 87, 1596-1600 (1990);
Pasquale, E.B. & Singer, S.J. Proc. Natn. Acad. Sci.
U.S.A. 86, 8722-8726 (1989); Safran, A. et al.
Oncogene 5, 635-643 (1990); Lee, P.L. et al.
Science 245, 57-60 (1989)). This sequence is even
retained in the shorter form of the mouse bFGFR,
which lacks the first IgG-like domain (Fig. 16b).
However, the KGFR did not contain such an acidic
domain. Whether this reflects significant
functional differences between these receptors
remains to be determined.

The intracellular portion of the KGFR was
highly homologous to the bFGFR tyrosine kinase (Fig.
16). The central core of the catalytic domain was
flanked by a relatively long juxtamembrane sequence,
and the tyrosine kinase domain was split by a short
insert of 14 residues, similar to that observed in
avian, human and murine bFGF receptors (Reid, H.H.,
et al., Proc. Natn. Acad. Sci U.S.A. 87, 1596-1600
(1990); Pasquale, E.B. & Singer, S.J. Proc. Natn.
Acad. Sci. U.S.A. 86, 8722-8726 (1989); Safran, A.,
et al. Oncogene 5, 635-643 (1990); Lee, P.L. et al.
Science 245, 57-60 (1989); Ruta, M. et al. Oncogene
3, 9-15 (1988); and Ruta, M. et al. Proc. Natn.
Acad. Sci. U.S.A. 86, 5449-5434 (1989)). Hanafusa
and co-workers isolated a partial cDNA encoding a
tyrosine kinase, designated bek, by bacterial
expression cloning using phosphotyrosine antibodies
(Kornbluth, S., et al., Molec. Cell Biol. 8,
5541-5544 (1988)). The reported sequence of bek was
identical to the KGFR in the tyrosine kinase domain
(Fig. 16b). Although only partial sequence of bek
is available, it is very likely to encode the mouse
KGF receptor.

4.5 Functional analysis of the cloned KGF receptor.

Because of the existence of more than one
receptor of the FGF family (Reid, H.H., et al. Proc.
Natn. Acad. Sci. U.S.A. 87, 1596-1600 (1990)), we
sought to characterize in detail the binding
properties of the KGF receptor isolated by
expression cloning. Scatchard analysis of [125I]-KGF
binding to the NIH/ect1 transfectant revealed
expression of two similar high affinity receptor
populations. Out of a total of ~3.8 x 10^5
sites/cell, 40% displayed a Kd of 180 pM, while the
remaining 60% showed a Kd of 480 pM (data not
shown). These values are comparable to the high
affinity KGF receptors displayed by BALB/MK cells.
The pattern of KGF and FGF competition for
[125I]-KGF binding to NIH/ect1 cells was also very
similar to that observed with BALB/MK cells (Fig.
17). Although maximum [125I]-KGF binding to NIH/ect1
cells was 3.5-fold higher than to BALB/MK, there was
50% displacement by 2 ng/ml of either KGF or aFGF
with each cell type. Similarly, both cell types showed
15-fold less efficient competition by bFGF for bound
[125I]-KGF. Together with observations that parental
NIH/3T3 cells lack detectable specific [125I]-KGF
binding (Fig. 14), these results demonstrate that
the cloned KGF receptor expressed in NIH/3T3 cells
conferred the characteristic pattern of KGF and FGF
competition displayed by BALB/MK cells.
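The Scatchard parameters reported above (about 3.8 x 10^5
sites/cell, 40% with a Kd of 180 pM and 60% with a Kd of 480
pM) can be turned into a predicted occupancy curve. The Python
sketch below assumes two independent, non-interacting site
populations obeying simple mass action; that model and all
names in the code are assumptions made for illustration and are
not stated in the specification.

```python
TOTAL_SITES = 3.8e5  # sites per cell, from the text

# (number of sites, Kd in pM) for each receptor population
POPULATIONS = [
    (0.40 * TOTAL_SITES, 180.0),
    (0.60 * TOTAL_SITES, 480.0),
]


def bound_sites(free_ligand_pm: float, populations=POPULATIONS) -> float:
    """Sites occupied per cell at a given free ligand concentration (pM).

    Each population contributes Bmax * L / (Kd + L); the contributions
    are summed because the populations are assumed to be independent.
    """
    return sum(
        bmax * free_ligand_pm / (kd + free_ligand_pm)
        for bmax, kd in populations
    )


if __name__ == "__main__":
    for conc in (50.0, 180.0, 480.0, 5000.0):
        occupied = bound_sites(conc)
        print(f"{conc:7.0f} pM -> {occupied:9.0f} sites "
              f"({100 * occupied / TOTAL_SITES:.0f}% of total)")
```

At a free ligand concentration equal to the lower Kd (180 pM),
the higher-affinity population is half saturated, and total
occupancy approaches 3.8 x 10^5 sites/cell only at
concentrations well above both Kd values.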

When [125I]-KGF is crosslinked to its
receptors on BALB/MK cells, two protein species of
165 and 137 kd have been observed. Taking into
account the size of KGF itself, we have estimated
the corresponding receptor species to be around 140
and 115 kd, respectively. When [125I]-KGF crosslinking
was performed with NIH/ect1 cells, we observed a
single species corresponding in size to the smaller,
137 kd species in BALB/MK cells (Fig. 17a).
Moreover, detection of this band was specifically
and efficiently blocked by unlabelled KGF. When
glycosylation is considered, the size of the KGF
receptor predicted by sequence analysis corresponds
reasonably well with the corrected size (115 kd) of
the crosslinked KGF receptor in the ect1
transfectant.

As a final test of the functional nature
of the KGF receptor expressed in NIH/ect1 cells, we
investigated its capacity to induce tyrosine
phosphorylation of cellular proteins. Thus, intact
NIH/ect1 cells were exposed to KGF for 10 min, and
cell lysates were subjected to immunoprecipitation
and immunoblotting analysis utilizing
anti-phosphotyrosine (anti-Ptyr) antibody. As shown
in Fig. 17b, several putative substrates were
tyrosine phosphorylated in response to KGF addition.
These included pp55, pp65, pp90, pp115, pp150 and
pp190. Previous studies have indicated that similar
size proteins are phosphorylated in response to KGF
triggering of BALB/MK cells. Moreover, the 115-kd
phosphoprotein matches the corrected size of the KGF
receptor crosslinked by [125I]-KGF. Thus, it may
reflect the autophosphorylated KGF receptor itself.

***********

For purposes of completing the
background description and present disclosure, each
of the published articles, patents and patent
applications heretofore identified in this
specification are hereby incorporated by reference
into the specification.

The foregoing invention has been
described in some detail for purposes of clarity and
understanding. It will also be obvious that various
combinations in form and detail can be made without
departing from the scope of the invention.

One skilled in the art will
appreciate that the capacity for efficient rescue of
cDNA clones from mammalian cells is an important
function of a stable mammalian expression cloning
system. When plasmid cDNA libraries are used to
transfect mammalian cells, single plasmids
integrated in genomic DNA are difficult to release.
Plasmid rescue is readily achieved only when
multiple copies are clustered at a single
integration site (Noda et al, PNAS, 86, 162-166,
1989). Excision of the plasmid by induction of
replication from the SV40 origin using COS cell
fusion often results in rearrangement or truncation.
To rescue cDNA clones efficiently from stable
integration sites within mammalian host cells,
Applicants used a strategy involving λ-plasmid
composite vectors. The vectors contain plasmid
excision sites for multiple cutters including NotI,
MluI and XhoI. This allows the intact plasmid
containing the insert to be efficiently rescued with
low probability of internal cleavage of the insert.



WHAT IS CLAIMED IS:

1. A genetic cloning vector comprising
at least one replicon; and
a site for inserting DNA segments to
be cloned that includes at least two non-symmetrical
restriction enzyme recognition sequences, wherein
at least two of said non-symmetrical
restriction enzyme recognition sequences are
identical, and
the first of said identical
restriction enzyme recognition sequences is in the
inverted orientation with respect to a second
identical sequence, and
said first and second identical
restriction enzyme recognition sequences include
greater than six positions having invariable DNA
base pairs.

2. The vector according to claim 1 
wherein said identical restriction enzyme 
recognition sequences can be cleaved by the 
restriction enzyme SfiI. 


3. A genetic cloning vector comprising 

at least one replicon; and 
a site for inserting DNA segments to 
be cloned that includes at least two non-symmetrical 
restriction enzyme recognition sequences, wherein 

at least two of said non-symmetrical 
restriction enzyme recognition sequences are 
nonidentical. 

4. The vector according to claim 3 
wherein at least two of said non-symmetrical 
restriction enzyme recognition sequences that are 
nonidentical can be cleaved by a single restriction 
enzyme. 

5. The vector according to claim 4 
wherein said single restriction enzyme is the 
restriction enzyme BstXI. 

6. The vector according to claim 3 
wherein at least one of said non-symmetrical 
restriction enzyme recognition sequences includes 
greater than six positions having invariable DNA 
base pairs. 

7. The vector according to claim 6 
wherein at least one of said non-symmetrical 
restriction enzyme recognition sequences including 
greater than six positions having invariable DNA 
base pairs can be cleaved by the restriction enzyme 

8. The vector according to claim 3 
wherein said replicon comprises a form of 
bacteriophage λ. 

9. The vector according to claim 3 
further comprising regulatory elements that are 
located in relation to said site for insertion of 
DNA segments such that, when a DNA segment is 
inserted into said site, at least a portion of the 
sequence of said DNA segment is transcribed. 

10. The vector according to claim 9 
wherein said regulatory elements consist of 
promoters that entirely originate from 
bacteriophage . 

11. The vector according to claim 10 
wherein said vector is either LambdaGEM™11 or 
LambdaGEM™12. 

12. The vector according to claim 9 
wherein said regulatory elements are at least partly 
of eukaryotic origin. 

13. The vector according to claim 12 
wherein said vector is either λpCEV15 or λpCEV9. 

14. The vector according to claim 3 
further comprising a selectable marker that is 
functional in eukaryotic cells in which the vector 
can be replicated. 

15. A method for cloning a cDNA copy of a 
eukaryotic mRNA, comprising the steps of: 

(i) annealing a linker-primer DNA 
segment comprising a single-stranded oligonucleotide 
which comprises oligo(dT) at the 3' end, and a 
single-stranded extension at the 5' end that is 
included in a first non-symmetrical restriction 
enzyme recognition sequence; 

(ii) enzymatically synthesizing the 
first strand of said cDNA from the linker-primer 
that is annealed with said mRNA molecule; 

(iii) enzymatically synthesizing the 
second strand of said cDNA using said first strand 
as the template under conditions such that 
single-stranded extensions on the synthesized cDNA 
molecule are made double-stranded; 

(iv) ligating onto the blunt-ended 
cDNA resulting from synthesizing said second strand, 
an adaptor DNA segment comprising a second non- 
symmetrical restriction enzyme recognition sequence 
that is nonidentical to said first non-symmetrical 
restriction enzyme recognition sequence; 

(v) exposing said cDNA resulting 
from ligation with said adaptor to one or more 
restriction enzymes that can cleave said first and 
second non-symmetrical restriction enzyme 
recognition sequences under conditions such that 
both of said sequences are cleaved, resulting in 
said vector DNA having two single-stranded ends that 
are not complementary; 

(vi) ligating said cDNA resulting 
from cleavage with said enzymes to DNA of a genetic 
cloning vector, said vector comprising 

at least one replicon; and 
a site for inserting DNA segments to 
be cloned that includes at least two non-symmetrical 
restriction enzyme recognition sequences, wherein 

in said DNA of said vector, at least 
two of said non-symmetrical restriction enzyme 
recognition sequences have been cleaved by one or 
more enzymes that can cleave said recognition 
sequences, resulting in said vector DNA having two 
single-stranded ends that are not complementary; 
wherein further, 

one of said single-stranded ends on 
said vector DNA that has been cleaved has a sequence 
that is complementary to the single-stranded 
extension on said linker-primer attached to said 
cDNA; and 

the other of said single-stranded 
ends on said vector DNA that has been cleaved has a 
sequence that is complementary to the single- 
stranded extension on said adaptor attached to said 
cDNA; and 

(vii) transforming a suitable host 
cell with the recombinant DNA segment comprising 
said cDNA and said vector DNA that results from said 
ligation of cDNA to vector DNA; and 

(viii) identifying a clone of host 
cells, resulting from transformation with said 
recombinant DNA, that contains a recombinant DNA 
segment including said cDNA. 

16. A DNA segment having the sequence of 
a genetic cloning vector comprising 

at least one replicon; and 
a site for inserting DNA segments to 
be cloned that includes at least two nonidentical 
restriction enzyme recognition sequences that are 
non-symmetrical, 

both of said nonidentical restriction 
enzyme recognition sequences having been cleaved by 
an enzyme or enzymes that can cleave them. 

17. A reagent kit comprising a vector DNA 
segment according to claim 16 and further including: 

a linker-primer having a single- 
stranded end that is complementary to one single- 
stranded end of said vector DNA; and 

an adaptor which after cleavage by a 
suitable restriction enzyme, has a single-stranded 
end that is complementary to the other single- 
stranded end of said vector DNA. 

18. A genetic cloning vector comprising 
at least one replicon; 

a site for inserting a DNA segment to 
be cloned that includes at least two non-symmetrical 
restriction enzyme recognition sequences that can be 
cleaved by a single restriction enzyme; and 

at least two regulatory elements 
located in relation to said site for insertion such 
that, when a DNA segment is inserted into said site, 
transcription of said DNA segment can be effected in 
both the sense and antisense directions. 

19. The cloning vector according to claim 
18 wherein said cloning vector is a phagemid and 
wherein said at least one replicon comprises at 
least one plasmid replicon and at least one phage 
replicon. 

20. The cloning vector according to claim 
19 further comprising a single stranded phage origin 
of replication. 


21. The cloning vector according to claim 
18 wherein said vector is pCEV27. 



22. A DNA segment having a sequence that 
encodes the amino acid sequence shown in Figure 9b. 

23. The DNA segment according to claim 22 
wherein said segment has the sequence shown in 
Figure 9b. 


24. A DNA segment encoding a keratinocyte 
growth factor receptor. 

25. The DNA segment according to claim 24 
wherein said receptor has the amino acid sequence 
given in Figure 15a, or allelic or species 
variations thereof. 

26. A DNA construct comprising a vector 
and a DNA segment encoding a keratinocyte growth 
factor receptor. 



27. A DNA construct comprising a vector 
and a DNA segment encoding a keratinocyte growth 
factor receptor wherein the vector is the cloning 
vector according to claim 18. 

28. The construct according to claim 26 
wherein the vector is pCEV27. 

29. A host cell comprising the construct 
according to claim 26. 

30. A keratinocyte growth factor receptor 
substantially free of proteins with which it is 
naturally associated. 


31. The receptor according to claim 30 
wherein said receptor has a higher affinity for 
keratinocyte growth factor and acidic fibroblast 
growth factor than basic fibroblast growth factor. 


32. The receptor according to claim 31 
wherein said receptor has the sequence shown in 
Figure 15a. 

33. A process of producing a keratinocyte 
growth factor receptor comprising culturing the 
cells according to claim 29 under conditions such 
that said DNA segment is expressed and said factor 
thereby produced. 


34. A method of identifying the presence 
in a DNA sequence of a gene the protein product of 
which confers a phenotypically identifiable trait 
comprising: 

i) preparing a DNA expression 
library containing said DNA sequence in said cloning 
vector according to claim 9 or 18, in a manner such 
that said gene is represented in said library; 

ii) introducing said library into a 
population of cells under conditions such that 
integration into the genome of said cells and 
expression of said gene are effected, so that said 
phenotypically identifiable trait is caused to be 
displayed; 

iii) isolating DNA from said cells of 
said population displaying said phenotypically 
identifiable trait; and 

iv) excising a DNA segment 
containing said gene from said integrated DNA. 

35. The method according to claim 34 
wherein said DNA sequence is a cDNA sequence. 

36. The method according to claim 34 
wherein said gene encodes a ligand, a receptor for 
which is normally produced by cells of said 
population, said cells, prior to introduction of 
said gene, being incapable of producing said ligand. 

37. The method according to claim 34 
wherein said gene encodes a receptor, the ligand 
which binds thereto being normally produced by cells 
of said population, said cells, prior to 
introduction of said gene, being incapable of 
producing said receptor. 

38. The method according to claim 34 
wherein said phenotypically identifiable trait is 
uncontrolled proliferation. 

39. The method according to claim 37 
wherein said receptor is the keratinocyte growth 
factor receptor. 

40. The method according to claim 34 
wherein said phenotypically identifiable trait is 
drug resistance. 